(57) Abstract

A method of locating an inhibitory/instability sequence or sequences within the coding region of an mRNA and modifying
the gene encoding that mRNA to remove these inhibitory/instability sequences by making clustered nucleotide substitutions
without altering the coding capacity of the gene is disclosed. Constructs containing these mutated genes and host cells containing
these constructs are also disclosed. The method and constructs are exemplified by the mutation of a Human Immunodeficiency
Virus-1 Rev-dependent gag gene to a Rev-independent gag gene. Constructs useful in locating inhibitory/instability sequences
within either the coding region or the 3' untranslated region of an mRNA are also disclosed. The exemplified constructs of the
invention may also be useful in HIV-1 immunotherapy and immunoprophylaxis.

METHOD OF ELIMINATING
INHIBITORY/INSTABILITY REGIONS OF mRNA

This application is a continuation-in-part of

U.S. Serial No. 07/858,747, filed March 27, 1992. 

I. TECHNICAL FIELD

The invention relates to methods of increasing 
the stability and/or utilization of a mRNA produced by a
gene by mutating regulatory or inhibitory/instability
sequences (INS) in the coding region of the gene which
prevent or reduce expression. The invention also relates
to constructs, including expression vectors, containing
genes mutated in accordance with these methods and host
cells containing these constructs. 

The methods of the invention are particularly 
useful for increasing the stability and/or utilization of 
a mRNA without changing its protein coding capacity. 
These methods are useful for allowing or increasing the
expression of genes which would otherwise not be expressed
or which would be poorly expressed because of the presence
of INS regions in the mRNA transcript. Thus, the methods,
constructs and host cells of the invention are useful for
increasing the amount of protein produced by any gene
which encodes an mRNA transcript which contains an INS.

The methods, constructs and host cells of the 
invention are useful for increasing the amount of protein 
produced from genes such as those coding for growth 
factors, interferons, interleukins, the fos proto-oncogene
protein, and HIV-1 gag and env, for example.

The invention also relates to using the 
constructs of the invention in immunotherapy and 
immunoprophylaxis, e.g., as a vaccine, or in genetic 
therapy after expression in humans. Such constructs can
include or be incorporated into retroviral or other 



WO 93/20212 



PCT/US93/02908 



expression vectors or they may also be directly injected 
into tissue cells resulting in efficient expression of the 
encoded protein or protein fragment. These constructs may 
also be used for in vivo or in vitro gene replacement,
e.g., by homologous recombination with a target gene in
situ.

The invention also relates to certain 
exemplified constructs which can be used to simply and 
rapidly detect and/or define the boundaries of 
inhibitory/instability sequences in any mRNA, methods of 
using these constructs, and host cells containing these 
constructs. Once the INS regions of the mRNAs have been 
located and/or further defined, the nucleotide sequences 
encoding these INS regions can be mutated in accordance 
with the method of this invention to allow the increase in 
stability and/or utilization of the mRNA and, therefore, 
allow an increase in the amount of protein produced from 
expression vectors encoding the mutated mRNA. 

II. BACKGROUND ART

While much work has been devoted to studying
transcriptional regulatory mechanisms, it has become
increasingly clear that post-transcriptional processes
also modulate the amount and utilization of RNA produced
from a given gene. These post-transcriptional processes
include nuclear post-transcriptional processes (e.g.,
splicing, polyadenylation, and transport) as well as
cytoplasmic RNA degradation. All these processes
contribute to the final steady-state level of a particular
transcript. These points of regulation create a more
flexible regulatory system than any one process could
produce alone. For example, a short-lived message is less
abundant than a stable one, even if it is highly
transcribed and efficiently processed. The efficient rate
of synthesis ensures that the message reaches the
cytoplasm and is translated, but the rapid rate of
degradation guarantees that the mRNA does not accumulate
to too high a level. Many RNAs, for example the mRNAs for
the proto-oncogenes c-myc and c-fos, have been studied which
exhibit this kind of regulation in that they are expressed
at very low levels, decay rapidly and are modulated
quickly and transiently under different conditions. See
M. Hentze, Biochim. Biophys. Acta 1090:281-292 (1991) for
a review. The rate of degradation of many of these mRNAs
has been shown to be a function of the presence of one or
more instability/inhibitory sequences within the mRNA
itself.

Some cellular genes which encode unstable or
short-lived mRNAs have been shown to contain A and U-rich
(AU-rich) INS within the 3' untranslated region (3' UTR)
of the transcript mRNA. These cellular genes include the
genes encoding granulocyte-monocyte colony stimulating
factor (GM-CSF), whose AU-rich 3' UTR sequences (containing
8 copies of the sequence motif AUUUA) are more highly
conserved between mice and humans than the protein
encoding sequences themselves (93% versus 65%) (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)) and the myc
proto-oncogene (c-myc), whose untranslated regions are conserved
throughout evolution (for example, 81% for man and mouse)
(M. Cole and S.E. Mango, Enzyme 44:167-180 (1990)). Other
unstable or short-lived mRNAs which have been shown to
contain AU-rich sequences within the 3' UTR include
interferons (alpha, beta and gamma IFNs); interleukins
(IL1, IL2 and IL3); tumor necrosis factor (TNF);
lymphotoxin (Lym); IgG1 induction factor (IgG IF);
granulocyte colony stimulating factor (G-CSF);
proto-oncogene (c-myb); and sis proto-oncogene (c-sis) (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)). See also, R.
Wisdom and W. Lee, Gen. & Devel. 5:232-243 (1991) (c-myc);
A. Shyu et al., Gen. & Devel. 5:221-231 (1991) (c-fos); T.
Wilson and R. Treisman, Nature 336:396-399 (1988) (c-fos);
T. Jones and M. Cole, Mol. Cell Biol. 7:4513-4521 (1987)
(c-myc); V. Kruys et al., Proc. Natl. Acad. Sci. USA
89:673-677 (1992) (TNF); D. Koeller et al., Proc. Natl.
Acad. Sci. USA 88:7778-7782 (1991) (transferrin receptor
(TfR) and c-fos); I. Laird-Offringa et al., Nucleic Acids
Res. 19:2387-2394 (1991) (c-myc); D. Wreschner and G.
Rechavi, Eur. J. Biochem. 172:333-340 (1988) (which
contains a survey of genes and relative stabilities);
Bunnell et al., Somatic Cell and Mol. Genet. 16:151-162
(1990) (galactosyltransferase-associated protein (GTA),
which contains an AU-rich 3' UTR with regions that are 98%
similar among humans, mice and rats); and Caput et al.,
Proc. Natl. Acad. Sci. 83:1670-1674 (1986) (TNF, which
contains a 33 nt AU-rich sequence conserved in toto in the
murine and human TNF mRNAs).

Some of these cellular genes which have been 
shown to contain INS within the 3' UTR of their mRNA have 
also been shown to contain INS within the coding region. 
See, e.g., R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991) (c-myc); A. Shyu et al., Gen. & Devel. 5:221-231
(1991) (c-fos).

Like the cellular mRNAs, a number of HIV-1 mRNAs
have also been shown to contain INS within the protein
coding regions, which in some cases coincide with areas of
high AU content. For example, a 218 nucleotide region
with high AU content (61.5%) present in the HIV-1 gag
coding sequence and located at the 5' end of the gag gene
has been implicated in the inhibition of gag expression.
S. Schwartz et al., J. Virol. 66:150-159 (1992). Further
experiments have indicated the presence of more than one
INS in the gag-protease gene region of the viral genome
(see below). Regions of high AU content have been found
in the HIV-1 gag/pol and env INS regions. The AUUUA
sequence is not present in the gag coding sequence, but it
is present in many copies within gag/pol and env coding
regions. S. Schwartz et al., J. Virol. 66:150-159 (1992).
See also, e.g., M. Emerman, Cell 57:1155-1165 (1989) (env
gene contains both 3' UTR and internal
inhibitory/instability sequences); C. Rosen, Proc. Natl.
Acad. Sci. USA 85:2071-2075 (1988) (env); M.
Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274 (1989)
(env); F. Maldarelli et al., J. Virol. 65:5732-5743 (1991)
(gag/pol); A. Cochrane et al., J. Virol. 65:5303-5313
(1991) (pol). F. Maldarelli et al., supra, note that the
direct analysis of the function of INS regions in the
context of a replication-competent, full-length HIV-1
provirus is complicated by the fact that the intragenic
INS are located in the coding sequences of virion
structural proteins. They further note that changes in
these intragenic INS sequences would in most cases affect
protein sequences as well, which in turn could affect the
replication of such mutants.
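
The two sequence features discussed above, overall AU content and the
number of AUUUA motifs, can be computed for any candidate region. The
following is a minimal sketch only; the function names au_content and
count_motif are illustrative assumptions and do not come from the cited
references.

```python
def au_content(rna: str) -> float:
    """Return the fraction of A and U residues in a sequence.

    DNA input is accepted; T is treated as U.
    """
    seq = rna.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AU" for base in seq) / len(seq)


def count_motif(rna: str, motif: str = "AUUUA") -> int:
    """Count occurrences of a motif, including overlapping ones."""
    seq = rna.upper().replace("T", "U")
    last_start = len(seq) - len(motif)
    return sum(seq.startswith(motif, i) for i in range(last_start + 1))
```

Applied to a region such as the 218 nucleotide gag segment described
above, au_content would be expected to return a value near 0.615, and
count_motif would return 0 if, as reported, the AUUUA sequence is absent
from the gag coding sequence.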

The INS regions are not necessarily AU-rich.
For example, the c-fos coding region INS is structurally
unrelated to the AU-rich 3' UTR INS (A. Shyu et al., Gen.
& Devel. 5:221-231 (1991)), and some parts of the env
coding region, which appear to contain INS elements, are
not AU-rich. Furthermore, some stable transcripts also
carry the AUUUA motif in their 3' UTRs, implying either
that this sequence alone is not sufficient to destabilize
a transcript, or that these messages also contain a
dominant stabilizing element (M. Cole and S.E. Mango,
Enzyme 44:167-180 (1990)). Interestingly, elements unique
to specific mRNAs have also been found which can stabilize
a mRNA transcript. One example is the Rev responsive
element, which in the presence of Rev protein promotes the
transport, stability and utilization of a mRNA transcript
(B. Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989)).

It is not yet known whether the AU sequences
themselves, and specifically the Shaw-Kamen sequence,
AUUUA, act as part or all of the degradation signal. Nor
is it clear whether this is the only mechanism employed
for short-lived messages, or if there are different
classes of RNAs, each with its own degradative system.
See M. Cole and S.E. Mango, Enzyme 44:167-180 (1990) for
a review; see also, T. Jones and M. Cole, Mol. Cell.
Biol. 7:4513-4521 (1987). Mutation of the only copy of
the AUUUA sequence in the c-myc RNA INS region has no
effect on RNA turnover; therefore the inhibitory sequence
may be quite different from that of GM-CSF (M. Cole and
S.E. Mango, Enzyme 44:167-180 (1990)), or else the mRNA
instability may be due to the presence of additional INS
regions within the mRNA.

Previous workers have made mutations in genes
encoding AU-rich inhibitory/instability sequences within
the 3' UTR of their transcript mRNAs. For example, G.
Shaw and R. Kamen, Cell 46:659-667 (1986), introduced a 51
nucleotide AT-rich sequence from GM-CSF into the 3' UTR of
the rabbit β-globin gene. This insertion caused the
otherwise stable β-globin mRNA to become highly unstable
in vivo, resulting in a dramatic decrease in expression of
β-globin as compared to the wild-type control. The
introduction of another sequence of the same length, but
with 14 G's and C's interspersed among the sequence, into
the same site of the 3' UTR of the rabbit β-globin gene
resulted in accumulation levels which were similar to that
of wild-type β-globin mRNA. This control sequence did not
contain the motif AUUUA, which occurs seven times in the
AU-rich sequence. The results suggested that the presence
of the AU-rich sequence in the β-globin mRNA specifically
confers instability.

A. Shyu et al., Gen. & Devel. 5:221-231 (1991),
studied the AU-rich INS in the 3' UTR of c-fos by
disrupting all three AUUUA pentanucleotides by single
U-to-A point mutations to preserve the AU-richness of the
element while altering its sequence. This change in the
sequence of the 3' UTR INS dramatically inhibited the
ability of the mutated 3' UTR to destabilize the β-globin
message when inserted into the 3' UTR of a β-globin mRNA
as compared to the wild-type INS. The c-fos
protein-coding region INS (which is structurally unrelated to the
3' UTR INS) was studied by inserting it in-frame into the
coding region of a β-globin and observing the effect of
deletions on the stability of the heterologous c-fos-β-
globin mRNA.
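
The logic of disrupting AUUUA pentanucleotides by single U-to-A changes
while preserving AU-richness can be sketched as follows. This is an
illustration only: the choice of the middle U as the mutated position and
the function name disrupt_auuua are assumptions made for the example and
are not taken from the cited study.

```python
def disrupt_auuua(rna: str) -> tuple[str, int]:
    """Replace the middle U of every AUUUA with A (AUUUA -> AUAUA).

    Returns the mutated sequence and the number of point mutations made.
    A-plus-U content is unchanged because each change swaps U for A.
    """
    seq = list(rna.upper().replace("T", "U"))
    changes = 0
    for i in range(len(seq) - 4):
        if "".join(seq[i:i + 5]) == "AUUUA":
            # Assumed position: the central U of the pentanucleotide.
            seq[i + 2] = "A"
            changes += 1
    return "".join(seq), changes
```

Because the substitution only exchanges one AU-class base for another, the
mutated element has the same AU content as the original but no longer
contains the AUUUA motif.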

Previous workers have also made mutations in 
genes encoding inhibitory/instability sequences within the
coding region of their transcript mRNAs. For example, P.
Carter-Muenchau and R. Wolf, Proc. Natl. Acad. Sci. USA
86:1138-1142 (1989), demonstrated the presence of a
negative control region that lies deep in the coding
sequence of the E. coli 6-phosphogluconate dehydrogenase
(gnd) gene. The boundaries of the element were defined by 
the cloning of a synthetic "internal complementary 
sequence" (ICS) and observing the effect of this internal 
complementary element on gene expression when placed at 
several sites within the gnd gene. The effect of single 
and double mutations introduced into the synthetic ICS 
element by site-directed mutagenesis on regulation of 
expression of a gnd-lacZ fusion gene correlated with the 
ability of the respective mRNAs to fold into secondary 
structures that sequester the ribosome binding site. 
Thus, the gnd gene's internal regulatory element appears 
to function as a cis-acting antisense RNA. 

M. Lundrigan et al., Proc. Natl. Acad. Sci. USA
88:1479-1483 (1991), conducted an experiment to identify
sequences linked to btuB that are important for its proper
expression and transcriptional regulation, in which a DNA
fragment carrying the region from -60 to +253 (the coding
region starts at +241) was mutagenized and then fused in
frame to lacZ. Expression of β-galactosidase from variant
plasmids containing a single base change was then
analyzed. The mutations were all GC to AT transitions,
as expected from the mutagenesis procedures used. Among
other mutations, a single base substitution at +253
resulted in greatly increased expression of the btuB-lacZ
gene fusion under both repressing and nonrepressing
conditions.

R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991), conducted an experiment which showed that mRNA
derived from a hybrid full length c-myc gene, which
contains a mutation in the translation initiation codon
from ATG to ATC, is relatively stable, implying that the
c-myc coding region inhibitory sequence functions in a
translation dependent manner.

R. Parker and A. Jacobson, Proc. Natl. Acad.
Sci. USA 87:2780-2784 (1990), demonstrated that a region of
42 nucleotides found in the coding region of Saccharomyces
cerevisiae MATα1 mRNA, which normally confers low
stability, can be experimentally inactivated by
introduction of a translation stop codon immediately
upstream of this 42 nucleotide segment. The experiments
suggest that the decay of MATα1 mRNA is promoted by the
translocation of ribosomes through a specific region of
the coding sequence. This 42 nucleotide segment has a
high content (8 out of 14) of rare codons (where a rare
codon is defined by its occurrence fewer than 13 times per
1000 yeast codons (citing S. Aota et al., Nucl. Acids
Res. 16:r315-r402 (1988))) that may induce slowing of
translation elongation. The authors of the study, R.
Parker and A. Jacobson, state that the concentration of
rare codons in the sequences required for rapid decay,
coupled with the prevalence of rare codons in unstable
yeast mRNAs and the known ability of rare codons to induce
translational pausing, suggests a model in which mRNA
structural changes may be affected by the particular
positioning of a paused ribosome. Another author stated
that it would be revealing to find out whether (and how) a
kinetic change in translation elongation could affect mRNA
stability (M. Hentze, Biochim. Biophys. Acta 1090:281-292
(1991)). R. Parker and A. Jacobson note, however, that
the stable PGK1 mRNA can be altered to include up to 40%
rare codons with, at most, a 3-fold effect on steady-state
mRNA level and that this difference may actually be due to
a change in transcription rates. Thus, these authors
conclude, it seems unlikely that ribosome pausing per se
is sufficient to promote rapid mRNA decay.
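
The rare-codon tally described above (8 of the 14 codons in a 42
nucleotide segment, with "rare" meaning fewer than 13 occurrences per
1000 codons) can be sketched as follows. The function name and the usage
table passed to it are hypothetical; a real analysis would require a
measured codon usage table for the organism in question.

```python
def rare_codon_count(coding_seq: str,
                     usage_per_thousand: dict[str, float],
                     threshold: float = 13.0) -> tuple[int, int]:
    """Return (number of rare codons, total codons) for an in-frame segment.

    A codon is counted as rare when its frequency per 1000 codons in the
    supplied usage table is below the threshold.
    """
    seq = coding_seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("segment length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    rare = sum(usage_per_thousand[c] < threshold for c in codons)
    return rare, len(codons)


# Placeholder frequencies for illustration only; these are NOT measured values.
toy_usage = {"AAA": 40.0, "CCC": 5.0, "GGG": 12.9, "TTT": 13.0}
print(rare_codon_count("AAACCCGGGTTT", toy_usage))
```

With the placeholder table, two of the four codons fall below the
threshold; a codon at exactly 13 per 1000 is not counted as rare under
the "fewer than 13" definition.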

None of the aforementioned references describes
or suggests the present invention of locating
inhibitory/instability sequences within the coding region
of an mRNA and modifying the gene encoding that mRNA to
remove these inhibitory/instability sequences by making
multiple nucleotide substitutions without altering the
coding capacity of the gene.

III. DISCLOSURE OF THE INVENTION

The invention relates to methods of increasing 
the stability and/or utilization of a mRNA produced by a 
gene by mutating regulatory or inhibitory/instability
sequences (INS) in the coding region of the gene which
prevent or reduce expression. The invention also relates
to constructs, including expression vectors, containing 
genes mutated in accordance with these methods and host 
cells containing these constructs. 

As defined herein, an inhibitory/instability
sequence of a transcript is a regulatory sequence that
resides within an mRNA transcript and is either (1)
responsible for rapid turnover of that mRNA and can
destabilize a second indicator/reporter mRNA when fused to
that indicator/reporter mRNA, or is (2) responsible for
underutilization of a mRNA and can cause decreased protein
production from a second indicator/reporter mRNA when
fused to that second indicator/reporter mRNA or (3) both
of the above. The inhibitory/instability sequence of a
gene is the gene sequence that encodes an
inhibitory/instability sequence of a transcript. As used
herein, utilization refers to the overall efficiency of
translation of an mRNA.

The methods of the invention are particularly
useful for increasing the stability and/or utilization of
a mRNA without changing its protein coding capacity.
However, alternative embodiments of the invention in which
the inhibitory/instability sequence is mutated such
that the amino acid sequence of the encoded protein is
changed to include conservative or non-conservative amino
acid substitutions, while still retaining the function of
the originally encoded protein, are also envisioned as
part of the invention.

These methods are useful for allowing or
increasing the expression of genes which would otherwise
not be expressed or which would be poorly expressed
because of the presence of INS regions in the mRNA
transcript. Thus, the invention provides methods of increasing
the production of a protein encoded by a gene which
encodes an mRNA containing an inhibitory/instability
region by altering the portion of the nucleotide sequence
of any gene encoding the inhibitory/instability region.

The methods, constructs and host cells of the
invention are useful for increasing the amount of protein
produced by any gene which encodes an mRNA transcript
which contains an INS, such as,
for example, those coding for growth factors, interferons,
interleukins, and the fos proto-oncogene protein, as well
as the genes coding for HIV-1 gag and env proteins.

The method of the invention is exemplified by
the mutational inactivation of an INS within the coding
region of the HIV-1 gag gene which results in increased
gag expression, and by constructs useful for
Rev-independent gag expression in human cells. This
mutational inactivation of the inhibitory/instability
sequences involves introducing multiple point mutations
into the AU-rich inhibitory sequences within the coding
region of the gag gene which, due to the degeneracy of
nucleotide coding sequences, do not affect the amino acid
sequence of the gag protein.
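
The principle that codon degeneracy permits nucleotide changes without
changing the encoded amino acids can be illustrated with the sketch
below. It is not the mutagenesis design used in the exemplified
constructs, which relied on selected clustered substitutions introduced
by mutagenic oligonucleotides; it is a simple assumed heuristic that
replaces each codon with a synonymous codon having the fewest A/T bases
and then confirms that the translation is unchanged. The function names
are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def at_count(codon: str) -> int:
    return sum(base in "AT" for base in codon)


def reduce_at_silently(dna: str) -> str:
    """Swap each codon for a synonymous codon with the fewest A/T bases."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        # Prefer low A/T; on ties keep the original codon, else sort by name.
        out.append(min(synonyms, key=lambda c: (at_count(c), c != codon, c)))
    mutated = "".join(out)
    # The coding capacity must be untouched.
    assert translate(mutated) == translate(dna)
    return mutated
```

A practical design would also weigh codon usage in the host cell and
would typically change only the regions identified as INS, rather than
every codon as this heuristic does.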

The constructs of the invention are exemplified 
by vectors containing the gag, env, and pol genes which
have been mutated in accordance with the methods of this 
invention and the host cells are exemplified by human 
HLtat cells containing these vectors. 

The invention also relates to using the 
constructs of the invention in immunotherapy and 
immunoprophylaxis, e.g., as a vaccine, or in genetic 
therapy after expression in humans. Such constructs can 
include or be incorporated into retroviral vectors or 
other expression vectors or they may also be directly 
injected into tissue cells resulting in efficient 
expression of the encoded protein or protein fragment. 
These constructs may also be used for in vivo or in vitro
gene replacement, e.g., by homologous recombination with a
target gene in situ.

The invention also relates to certain 
exemplified constructs which can be used to simply and 
rapidly detect and/or further define the boundaries of 
inhibitory/instability sequences in any mRNA which is 
known or suspected to contain such regions, whether the 
INS are within the coding region or in the 3'UTR or both. 
Once the INS regions of the genes have been located and/or 
further defined through the use of these vectors, the same 
vectors can be used in mutagenesis experiments to 
eliminate the identified INS without affecting the coding 
capacity of the gene, thereby allowing an increase in the 
amount of protein produced from expression vectors 
containing these mutated genes. The invention also 
relates to methods of using these constructs and to host 
cells containing these constructs. 

The constructs of the invention which can be 
used to detect instability/inhibitory regions within an
mRNA are exemplified by the vectors p19, p17M1234,
p37M1234 and p37M1-10D, which are set forth in Fig. 1. (B)
and Fig. 6. p37M1234 and p37M1-10D are the preferred
constructs, due to the existence of a commercially
available ELISA test which allows the simple and rapid
detection of any changes in the amount of expression of
the gag indicator/reporter protein. However, any
constructs which contain the elements depicted between the
long terminal repeats in the aforementioned constructs of
Fig. 1. (B) and Fig. 6, and which can be used to detect
instability/inhibitory regions within a mRNA, are also
envisioned as part of this invention. 

The existence of inhibitory/instability
sequences has been known in the art, but no solution to 
the problem which allowed increased expression of the 
genes encoding the mRNAs containing these sequences within 
coding regions by making multiple nucleotide 
substitutions, without altering the coding capacity of the 
gene, has heretofore been disclosed. 

IV. BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1. (A) Structure of the HIV-1 genome. Boxes indicate
the different viral genes. (B) Structure of the gag
expression plasmids (see infra). Plasmid p17 contains the
complete HIV-1 5' LTR and sequences up to the BssHII
restriction site at nucleotide (nt) 257. (The nucleotide
numbering refers to the revised nucleotide sequence of the
HIV-1 molecular clone pHXB2 (G. Myers et al., Eds. Human
retroviruses and AIDS. A compilation and analysis of
nucleic acid and amino acid sequences (Los Alamos National
Laboratory, Los Alamos, New Mexico, 1991), incorporated
herein by reference)). This sequence is followed by the
p17gag coding sequence spanning nt 336-731 (represented as
an open box) immediately followed by a translational stop
codon and a linker sequence. Adjacent to the linker is
the HIV-1 3' LTR from nt 8561 to the last nucleotide of
the U5 region. Plasmid p17R contains in addition the 330
nt StyI fragment encompassing the RRE (L. Solomin et al.,
J. Virol. 64:6010-6017 (1990)) (represented as a stippled
box) 3' to the p17gag coding sequence. The RRE is followed
by HIV-1 sequences from nt 8021 to the last nucleotide of
the U5 region of the 3' LTR. Plasmids p19 and p19R were
generated by replacing the HIV-1 p17gag coding sequence in
plasmids p17 and p17R, respectively, with the RSV p19gag
coding sequence (represented as a black box). Plasmid
p17M1234 is identical to p17, except for the presence of
28 silent nucleotide substitutions within the gag coding
region, indicated by XXX. Wavy lines represent plasmid
sequences. Plasmid p17M1234(731-1424) and plasmid
p37M1234 are described immediately below and in the
description. These vectors are illustrative of constructs
which can be used to determine whether a particular
nucleotide sequence encodes an INS. In this instance,
vector p17M1234, which contains an indicator gene (here,
p17gag), represents the control vector and vectors
p17M1234(731-1424) and p37M1234 represent vectors in which
the nucleotide sequence of interest (here the p24gag coding
region) is inserted into the vector either 3' to the stop
codon of the indicator gene or is fused in frame to the
coding region of the indicator gene, respectively. (C)
Construction of expression vectors for identification of
gag INS and for further mutagenesis. p17M1234 was used as
a vector to insert additional HIV-1 gag sequences
downstream from the coding region of the altered p17gag
gene. Three different fragments indicated by nucleotide
numbers were inserted into vector p17M1234 as described
below. To generate plasmids p17M1234(731-1081),
p17M1234(731-1424) and p17M1234(731-2165), the indicated
fragments were inserted 3' to the stop codon of the p17gag
coding sequence in p17M1234. In expression assays (data
not shown), p17M1234(731-1081) and p17M1234(731-1424)
expressed high levels of p17gag protein. In contrast,
p17M1234(731-2165) did not express p17gag protein,
indicating the presence of additional INS within the HIV-1
gag coding region. To generate plasmids
p17M1234(731-1081)NS, p37M1234 and p55M1234, the stop codon at the end
of the altered p17gag gene and all linker sequences in
p17M1234 were eliminated by oligonucleotide-directed
mutagenesis and the resulting plasmids restored the gag
open reading frame as in HIV-1. In expression assays
(data not shown) p37M1234 expressed high levels of protein
as determined by western blotting and ELISA assays whereas
p55M1234 did not express any detectable gag protein.
Thus, the addition of sequences 3' to the p24 region
resulted in the elimination of protein expression,
indicating that nucleotide sequence 1424-2165 contains an
INS. This experiment demonstrated that p37M1234 is an
appropriate vector to analyze additional INS.



Fig. 2. Gag expression from the different vectors. (A)
HLtat cells were transfected with plasmid p17, p17R, or
p17M1234 in the absence (-) or presence (+) of Rev (see
infra). The transfected cells were analyzed by
immunoblotting using a human HIV-1 patient serum. (B)
Plasmid p19 or p19R was transfected into HLtat cells in
the absence (-) or presence (+) of Rev. The transfected
cells were analyzed by immunoblotting using rabbit
anti-RSV p19gag serum. HIV or RSV proteins served as
markers in the same gels. The positions of p17gag and p19gag
are indicated at right.

Fig. 3. mRNA analysis on northern blots. (A) HLtat cells
were transfected with the indicated plasmids in the
absence (-) or presence (+) of Rev. 20 μg of total RNA
prepared from the transfected cells were analyzed (see
infra). (B) RNA production from plasmid p19 or p19R was
similarly analyzed in the absence (-) or presence (+) of
Rev.

Fig. 4. Nucleotide sequence of the HIV-1 p17gag region.
The locations of the 4 oligonucleotides (M1-M4) used to
generate all mutants are underlined. The silent
nucleotide substitutions introduced by each mutagenesis
oligonucleotide are indicated below the coding sequence.
Numbering starts from nt +1 of the viral mRNA.

Fig. 5. Gag expression by different mutants. HLtat cells
were transfected with the various plasmids indicated at
the top of the figure. Plasmid p17R was transfected in
the absence (-) or presence (+) of Rev, while the other
plasmids were analyzed in the absence of Rev. p17gag
production was assayed by immunoblotting as described in
Fig. 2.

Fig. 6. Expression vectors used in the identification
and elimination of additional INS elements in the gag
region. The gag and pol region nucleotides included in
each vector are indicated by lines. The position of some
gag and pol oligonucleotides is indicated at the top of
the figure, as are the coding regions for p17gag, p24gag,
p15gag, protease and p66pol proteins. Vector p37M1234 was
further mutagenized using different combinations of
oligonucleotides. One obtained mutant gave high levels of
p24 after expression. It was analyzed by sequencing and
found to contain four mutant oligonucleotides M6gag,
M7gag, M8gag and M10gag. Other mutants containing
different combinations of oligos did not show an increase
in expression, or only partial increase in expression.
p55BM1-10 and p55AM1-10 were derived from p37M1-10D.
p55M1-13P0 contains additional mutations in the gag and
pol regions included in the oligonucleotides M11gag,
M12gag, M13gag and M0pol. The hatched boxes indicate the
location of the mutant oligonucleotides; the hatched boxes
containing circles indicate mutated regions containing
ATTTA sequences, which may contribute to instability
and/or inhibition of the mRNA; and the open boxes
containing triangles indicate mutated regions containing
AATAAA sequences, which may contribute to instability
and/or inhibition of the mRNA. Typical levels of p24gag
expression in human cells after transfections as described
supra are shown at the right (in pg/ml).

Fig. 7. Eukaryotic expression plasmids used to study env
expression. The different expression plasmids are derived
from pNL15E (Schwartz, et al. J. Virol. 64:5448-5456
(1990)). The generation of the different constructs is
described in the text. The numbering follows the
corrected HXB2 sequence (Myers et al., 1991, supra; Ratner
et al., Hamatol. Bluttransfus. 31:404-406 (1987); Ratner
et al., AIDS Res. Hum. Retroviruses 3:57-69 (1987);
Solomin, et al. J. Virol. 64:6010-6017 (1990)), starting
with the first nucleotide of R as +1. 5'SS, 5' splice
site; 3'SS, 3' splice site.

Fig. 8. Env expression is Rev dependent in the absence of
functional splice sites. Plasmids p15ESD- and p15EDSS (C)
were transfected in the absence or presence of a rev
expression plasmid (pL3crev) into HLtat cells. One day
later, the cells were harvested for analyses of RNA and
protein. Total RNA was extracted and analyzed on Northern
blots (B). The blots were hybridized with a
nick-translated probe spanning XhoI-SacI (nt 8443 to 9118)
of HXB2. Protein production was measured by western blots
to detect cell-associated Env using a mixture of HIV-1
patient sera and rabbit anti-gp120 antibody (A).

Fig. 9. Env production from the gp120 expression plasmids.
The indicated plasmids were transfected into HLtat cells
in duplicate plates. A rev expression plasmid (pL3crev)
was cotransfected as indicated. One day later, the cells
were harvested for analyses of RNA and protein. Total RNA
was extracted and analyzed on Northern blots (A). The
blots were hybridized using a nick-translated probe
spanning nt 6158 to 7924. Protein production (B) was
measured by immunoprecipitation after labeling for 5 h
with 200 µCi/ml of 35S-cysteine to detect secreted
processed Env (gp120).

Fig. 10. The identification of INS elements within gp120
and gp41 using the p19 (RSV gag) test system. Schematic
structure of exon 5E containing the env ORF. Different
fragments (A to G) of the gp41 portion and fragment H of
the vpu/gp120 portion were PCR amplified and inserted into
the unique EcoRI site located downstream of the RSV gag
gene in p19. The location of the sequences included in
the amplified fragments is indicated to the right using
the HXB2R numbering system. Fragments A and B are amplified
from pNL15E and pNL15EDSS (in which the splice acceptor
sites 7A, 7B and 7 have been deleted), respectively, using
the same oligonucleotide primers. They are 276 and 234
nucleotides long, respectively. Fragment C was amplified
from pNL15EDSS as a 323 nucleotide fragment. Fragment F
is a HpaI-KpnI restriction fragment of 362 nucleotides.
Fragment E was amplified as a 668 nucleotide fragment from
pNL15EDSS; therefore, the major splice donor at nucleotide
5592 of HXB2 has been deleted. The rest of the fragments
were amplified from pNL15E as indicated in the figure.
HLtat cells were transfected with these constructs. One
day later, the cells were harvested and p19gag production
was determined by Western blot analysis using the
anti-RSV Gag antibody. The expression of Gag from these
plasmids was compared to Gag production of p19. SA, splice
acceptor; B, BamHI; H, HpaI; X, XhoI; K, KpnI. The down
regulatory effect of INS contained within the different
fragments is indicated at right.

Fig. 11. The identification of INS elements within gp120
and gp41 using the p37M1-10D (mutant INS p37gag expression
system) test system. Schematic structure of the env ORF.
Different fragments (1 to 7) of env were PCR amplified as
indicated in the figure and inserted into the polylinker
located downstream of the p37 mutant gag gene in
p37M1-10D. Fragments 1 to 6 were amplified from the
molecular clone pLW2.4, a gift of Dr. M. Reitz, which is
very similar to HXB2R. Clone pLW2.4 was derived from an
individual infected by the same HIV-1 strain IIIB, from
which the HXB2R molecular clone has been derived.
Fragment 7 was cloned from pNL43. For consistency and
clarity, the numbering follows the HXB2R system. HLtat
cells were transfected with these constructs. One day
later, the cells were harvested and p24gag production was
determined by antigen capture assay. The expression of
Gag from these plasmids was compared to Gag production of
p37M1-10D. The down regulatory effect of each fragment is
indicated at right.

Fig. 12. Elimination of the negative effects of CRS in
the pol region. Nucleotides 3700-4194 of HIV-1 were
inserted in vector p37M1234 as indicated. This resulted
in the inhibition of gag expression. Using mutant
oligonucleotides M9pol-M12pol (P9-P12), several mutated
CRS clones were isolated and characterized. One of them,
p37M1234RCRSP10+P12p, contains the mutations indicated in
Fig. 13. This clone produced high levels of gag.
Therefore, the combination of mutations in
p37M1234RCRSP10+P12p eliminated the INS, while mutations
only in the region of P10 or of P12 did not eliminate the
INS.

Fig. 13. Point mutations eliminating the negative effects
of CRS in the pol region (nucleotides 3700-4194). The
combination of mutations able to completely inactivate the
inhibitory/instability element within the CRS region of
HIV-1 pol (nucleotides 3700-4194) is shown under the
sequence in small letters. These mutations are contained
within oligonucleotides M10pol and M12pol (see Table 2).
The M12pol oligonucleotide contains additional mutations that
were not introduced into p37M1234RCRSP10+P12p (see Fig.
12), as determined by DNA sequencing.

Fig. 14. Plasmid map and nucleotide sequence of the
efficient gag expression vector p37M1-10D. (A) Plasmid
map of vector p37M1-10D. The plasmid contains a
pBluescriptKS(-) backbone, human genomic sequences
flanking the HIV-1 sequences as found in pNL43 genomic
clone, HIV-1 LTRs and the p37gag region (p17 and p24). The
p17 region has been mutagenized using oligonucleotides M1
to M4, and the p24 region has been mutagenized using
oligonucleotides M6, M7, M8 and M10, as described in the
text. The coding region for p37 is flanked by the 5' and
3' HIV-1 LTRs, which provide promoter and polyadenylation
signals, as indicated by the arrows. Three consecutive
arrows indicate the U5, R, and U3 regions of the LTR
respectively. The transcribed portions of the LTRs are
shown in black. The translational stop codon inserted at
the end of the p24 coding region is indicated at position
1818. Some restriction endonuclease cleavage sites are
also indicated. (B-D) Complete nucleotide sequence of
p37M1-10D. The amino acid sequence of the p37gag protein
is shown under the coding region. Symbols are as above.
Numbering starts at the first nucleotide of the 5' LTR.

V. MODES FOR CARRYING OUT THE INVENTION

It is to be understood that both the foregoing
general description and the following detailed description
are exemplary and explanatory only, and are not
restrictive of the invention, as claimed. The
accompanying drawings, which are incorporated in and
constitute a part of the specification, illustrate an
embodiment of the invention and, together with the
description, serve to explain the principles of the
invention.

The invention comprises methods for eliminating
intragenic inhibitory/instability regions of an mRNA by
(a) identifying the intragenic inhibitory/instability
regions, and (b) mutating the intragenic
inhibitory/instability regions by making multiple point
mutations. These mutations may be clustered. This method
does not require the identification of the exact location
or knowledge of the mechanism of function of the INS.
Nonetheless, the results set forth herein allow the
conclusion that multiple regions within mRNAs participate
in determining stability and utilization and that many of
these elements act at the level of RNA transport,
turnover, and/or localization. Generally, the mutations
are such that the amino acid sequence encoded by the mRNA
is unchanged, although conservative and non-conservative
amino acid substitutions are also envisioned as part of
the invention where the protein encoded by the mutated
gene is substantially similar to the protein encoded by
the non-mutated gene.

The nucleotides to be altered can be chosen
randomly, the only requirement being that the amino acid
sequence of the encoded protein remain unchanged; or, if
conservative and non-conservative amino acid substitutions
are to be made, the only requirement is that the protein
encoded by the mutated gene be substantially similar to
the protein encoded by the non-mutated gene.

If the INS region is AT rich or GC rich, it is
preferable that it be altered so that it has a content of
about 50% G and C and about 50% A and T. If the INS
region contains less-preferred codons, it is preferable
that those be altered to more-preferred codons. If
desired, however (e.g., to make an A and T rich region
more G and C rich), more-preferred codons can be altered
to less-preferred codons. If the INS region contains
conserved nucleotides, some of those conserved nucleotides
could be altered to non-conserved nucleotides. Again, the
only requirement is that the amino acid sequence encoded
by the protein remain unchanged; or, if conservative and
non-conservative amino acid substitutions are to be made,
the only requirement is that the protein encoded by the
mutated gene be substantially similar to the protein
encoded by the non-mutated gene.

As used herein, conserved nucleotides means 
evolutionarily conserved nucleotides for a given gene, 
since this conservation may reflect the fact that they are 
part of a signal involved in the inhibitory/instability 
determination. Conserved nucleotides can generally be 
determined from published references about the gene of 
interest or can be determined by using a variety of 
computer programs available to practitioners of the art. 

Less-preferred and more-preferred codons for
various organisms can be determined from codon usage
charts, such as those set forth in T. Maruyama et al.,
Nucl. Acids Res. 14:r151-r197 (1986) and in S. Aota et
al., Nucl. Acids Res. 16:r315-r402 (1988), or through use
of a computer program, such as that disclosed in U.S.
Patent No. 5,082,767 entitled "Codon Pair Utilization",
issued to G. W. Hatfield et al. on January 21, 1992, which
is incorporated herein by reference.
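
As an illustration of how a codon usage chart of this kind can be derived, the following is a minimal Python sketch, not the program of the cited patent or references. It assumes the practitioner supplies a set of coding sequences from the host of interest; the function names, the 0.15 threshold and the input format are hypothetical choices made for the example.

```python
from collections import Counter

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def relative_codon_usage(coding_sequences):
    """Return each codon's share of use among its synonymous codons.

    Only amino acids that actually occur in the input are reported.
    """
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        for pos in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[pos:pos + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    usage = {}
    for codon, amino_acid in CODON_TABLE.items():
        family_total = sum(
            counts[other] for other, aa in CODON_TABLE.items() if aa == amino_acid
        )
        if family_total:
            usage[codon] = counts[codon] / family_total
    return usage


def less_preferred_codons(usage, threshold=0.15):
    """List sense codons whose share within their family is below threshold."""
    return sorted(
        codon for codon, share in usage.items()
        if share < threshold and CODON_TABLE[codon] != "*"
    )
```

With a realistic set of host genes as input, the codons returned by `less_preferred_codons` would be the candidates for replacement; the cutoff that separates "less-preferred" from "more-preferred" is a judgment call and is not specified in the text.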

Generally, the method of the invention is 
carried out as follows: 

1. Identification of an mRNA Containing an INS

The rate at which a particular protein is made
is usually proportional to the cytoplasmic level of the
mRNA which encodes it. Thus, a candidate for an mRNA
containing an inhibitory/instability sequence is one whose
mRNA or protein is either not detectably expressed or is
expressed poorly as compared to the level of expression of
a reference mRNA or protein under the control of the same
or similar strength promoter. Differences in the steady
state levels of a particular mRNA (as determined, for
example, by Northern blotting), when compared to the
steady state level of mRNA from another gene under the
control of the same or similar strength promoter, which
cannot be accounted for by changes in the apparent rate of
transcription (as determined, for example, by nuclear
run-on assays) indicate that the gene is a candidate for an
unstable mRNA. In addition or as an alternative to being
unstable, cytoplasmic mRNAs may be poorly utilized due to
various inhibitory mechanisms acting in the cytoplasm.
These effects may be mediated by specific mRNA sequences
which are named herein as "inhibitory sequences".

Candidate mRNAs containing
inhibitory/instability regions include mRNAs from genes
whose expression is tightly regulated, e.g., many
oncogenes, growth factor genes and genes for biological
response modifiers such as interleukins. Many of these
genes are expressed at very low levels, decay rapidly and
are modulated quickly and transiently under different
conditions. The negative regulation of expression at the
level of mRNA stability and utilization has been
documented in several cases and has been proposed to be
occurring in many other cases. Examples of genes for
which there is evidence for post-transcriptional
regulation due to the presence of inhibitory/instability
regions in the mRNA include the cellular genes encoding
granulocyte-monocyte colony stimulating factor (GM-CSF);
proto-oncogenes c-myc, c-myb, c-sis, c-fos; interferons
(alpha, beta and gamma IFNs); interleukins (IL1, IL2 and
IL3); tumor necrosis factor (TNF); lymphotoxin (Lym); IgG1
induction factor (IgG IF); granulocyte colony stimulating
factor (G-CSF); transferrin receptor (TfR); and
galactosyltransferase-associated protein (GTA); HIV-1
genes encoding env, gag and pol; the E. coli genes for
6-phosphogluconate dehydrogenase (gnd) and btuB; and the
yeast gene for MATα1 (see the discussion in the
"Background Art" section, above). The genes encoding the
cellular proto-oncogenes c-myc and c-fos, as well as the
yeast gene for MATα1 and the HIV-1 genes for gag, env and
pol are genes for which there is evidence for
inhibitory/instability regions within the coding region in
addition to evidence for inhibitory/instability regions
within the non-coding region. Genes encoding or suspected
of encoding mRNAs containing inhibitory/instability
regions within the coding region are particularly relevant
to the invention.

After identifying a candidate unstable or poorly
utilized mRNA, the in vivo half-life (or stability) of
that mRNA can be studied by conducting pulse-chase
experiments (i.e., labeling newly synthesized RNAs with a
radioactive precursor and monitoring the decay of the
radiolabeled mRNA in the absence of label); or by
introducing in vitro transcribed mRNA into target cells
(either by microinjection, calcium phosphate
co-precipitation, electroporation, or other methods known in
the art) to monitor the in vivo half-life of the defined
mRNA population; or by expressing the mRNA under study
from a promoter which can be induced and which shuts off
transcription soon after induction, and estimating the
half-life of the mRNA which was synthesized during this
short transcriptional burst; or by blocking transcription
pharmacologically (e.g., with Actinomycin D) and following
the decay of the particular mRNA at various time points
after the addition of the drug by Northern blotting or RNA
protection (e.g., S1 nuclease) assays. Methods for all the
above determinations are well established. See, e.g.,
M.W. Hentze et al., Biochim. Biophys. Acta 1090:281-292
(1991) and references cited therein. See also,
S. Schwartz et al., J. Virol. 66:150-159 (1992). The most
useful measurement is how much protein is produced,
because this includes all possible INS mechanisms.
Examples of various mRNAs which have been shown to contain
or which are suspected to contain INS regions are
described above. Some of these mRNAs have been shown to
have half-lives of less than 30 minutes when their mRNA
levels are measured by Northern blots (see, e.g., D.
Wreschner and G. Rechavi, Eur. J. Biochem. 172:333-340
(1988)).
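
For the transcription-block approach described above, a half-life can be estimated from the time course of mRNA levels under the common assumption of first-order (exponential) decay. The following Python sketch fits ln(level) against time by least squares; the function name and the numbers used in the usage note are hypothetical and are not data from this specification.

```python
import math


def mrna_half_life(times_min, levels):
    """Estimate half-life (in minutes) assuming first-order decay.

    times_min: minutes after transcription was blocked.
    levels: relative mRNA signal at each time point (all > 0),
            e.g. densitometry of Northern blot bands.
    """
    if len(times_min) != len(levels) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(level) for level in levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals minus the decay constant k
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope
```

For example, a signal that falls from 100 to 50 to 25 to 6.25 at 0, 15, 30 and 60 minutes gives `mrna_half_life([0, 15, 30, 60], [100, 50, 25, 6.25])` of about 15 minutes. Real time courses are noisier, and decay is not always first order, so the fit should be inspected rather than trusted blindly.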

2. Localization of Instability Determinants

When an unstable or poorly utilized mRNA has
been identified, the next step is to search for the
responsible (cis-acting) RNA sequence elements. Detailed
methods for localizing the cis-acting
inhibitory/instability regions are set forth in each of
the references described in the "Background Art" section,
above, and are also discussed infra. The exemplified
constructs of the present invention can also be used to
localize INS (see below). Cis-acting sequences
responsible for specific mRNA turnover can be identified
by deletion and point mutagenesis as well as by the
occasional identification of naturally occurring mutants
with an altered mRNA stability.

In short, to evaluate whether putative
regulatory sequences are sufficient to confer mRNA
stability control, DNA sequences coding for the suspected
INS regions are fused to an indicator (or reporter) gene
to create a gene coding for a hybrid mRNA. The DNA
sequences fused to the indicator (or reporter) gene can be
cDNA, genomic DNA or synthesized DNA. Examples of
indicator (or reporter) genes that are described in the
references set forth in the "Background Art" section
include the genes for neomycin, β-galactosidase,
chloramphenicol acetyltransferase (CAT), and luciferase,
as well as the genes for β-globin, PGK1 and ACT1. See
also Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2d. ed., Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, (1989), pp. 16.56-16.67. Other genes
which can be used as indicator genes are disclosed herein
(i.e., the gag gene of the Rous Sarcoma Virus (which lacks
an inhibitory/instability region) and the Rev independent
HIV-1 gag genes of constructs p17M1234, p37M1234 and
p37M1-10D, which have been mutated to inactivate the
inhibitory/instability region and which constitute one
aspect of the invention). In general, virtually any gene
encoding a mRNA which is stable or which is expressed at
relatively high levels (defined here as being stable
enough or expressed at high enough level so that any
decrease in the level of the mRNA or expressed protein can
be detected by standard methods) can be used as an
indicator or reporter gene, although the constructs
p37M1234 and p37M1-10D, which are exemplified herein, are
preferred for reasons set forth below. Preferred methods
of creating hybrid genes using these constructs and
testing the expression of mRNA and protein from these
constructs are also set forth below.

In general, the stability and/or utilization of
the mRNAs generated by the indicator gene and the hybrid
genes consisting of the indicator gene fused to the
sequences suspected of encoding an INS region are tested
by transfecting the hybrid genes into host cells which are
appropriate for the expression vector used to clone and
express the mRNAs. The resulting levels of mRNA are
determined by standard methods of determining mRNA
stability, e.g., Northern blots, S1 mapping or PCR methods,
and the resulting levels of protein produced are
quantitated by protein measuring assays, such as ELISA,
immunoprecipitation and/or western blots. The
inhibitory/instability region (or regions, if there are
more than one) will be identified by a decrease in the
protein expression and/or stability of the hybrid mRNA as
compared to the control indicator mRNA. Note that if the
ultimate goal is to increase production of the encoded
protein, the identification of the INS is most preferably
carried out in the same host cell as will be used for the
production of the protein.

Examples of some of the host cells that have
been used to detect INS sequences include somatic
mammalian cells, Xenopus oocytes, yeast and E. coli. See,
e.g., G. Shaw and R. Kamen, Cell 46:659-667 (1986)
(discussed supra), which localized instability sequences in
GM-CSF by inserting putative inhibitory sequences into the
3' UTR of the β-globin gene, causing the otherwise stable
β-globin mRNA to become unstable when transfected into
mouse or human cells. See also I. Laird-Offringa et al.,
Nucleic Acids Res. 19:2387-2394 (1991), which localized
inhibitory/instability sequences in c-myc using hybrid
c-myc-neomycin resistance genes introduced into rat
fibroblasts, and M. Lundrigan et al., Proc. Natl. Acad.
Sci. USA 88:1479-1483 (1991), which localized
inhibitory/instability sequences in the btuB gene by using
hybrid btuB-lacZ genes introduced into E. coli. For
examples of reported localization of specific
inhibitory/instability sequences within a transcript of
HIV-1 by destabilization of an otherwise long-lived
indicator transcript, see, e.g., M. Emerman, Cell
57:1155-1165 (1989) (replaced 3' UTR of env gene with part of HBV
and introduced into COS-1 cells); S. Schwartz et al., J.
Virol. 66:150-159 (1992) (gag gene fusions with Rev
independent tat reporter gene introduced into HeLa cells);
F. Maldarelli et al., J. Virol. 65:5732-5743 (1991)
(gag/pol gene fusions with Rev independent tat reporter
gene or chloramphenicol acetyltransferase (CAT) gene
introduced into HeLa and SW480 cells); and A. Cochrane et
al., J. Virol. 65:5303-5313 (1991) (pol gene fusions with
CAT gene or rat proinsulin gene introduced into COS-1 and
CHO cells).

It is anticipated that in vitro mRNA degradation
systems (e.g., crude cytoplasmic extracts) to assay mRNA
turnover in vitro will complement ongoing in vivo analyses
and help to circumvent some of the limitations of the in
vivo systems. See M.W. Hentze et al., Biochim. Biophys.
Acta 1090:281-292 (1991) and references cited therein.
See also D. Wreschner and G. Rechavi, Eur. J. Biochem.
172:333-340 (1988), which analyzed exogenous mRNA
stability in a reticulocyte lysate cell-free system.

In the method of the invention, the whole gene
of interest may be fused to an indicator or reporter gene
and tested for its effect on the resulting hybrid mRNA in
order to determine whether that gene contains an
inhibitory/instability region or regions. To further
localize the INS within the gene of interest, fragments of
the gene of interest may be prepared by sequentially
deleting sequences from the gene of interest from either
the 5' or 3' ends or both. The gene of interest may also
be separated into overlapping fragments by methods known
in the art (e.g., with restriction endonucleases, etc.).
See, e.g., S. Schwartz et al., J. Virol. 66:150-159
(1992). Preferably, the gene is separated into
overlapping fragments about 300 to 2000 nucleotides in
length. Two types of vector constructs can be made. To
permit the detection of inhibitory/instability regions
that do not need to be translated in order to function,
vectors can be constructed in which the gene of interest
(or its fragments or suspected INS) can be inserted into
the 3' UTR downstream from the stop codon of an indicator
or reporter gene. This does not permit translation
through the INS. To test the possibility that some
inhibitory/instability sequences may act only after
translation of the mRNA, vectors can be constructed in
which the gene of interest (or its fragments or suspected
INS) is inserted into the coding region of the
indicator/reporter gene. This method will permit the
detection of inhibitory/instability regions that do need
to be translated in order to function. The hybrid
constructs are transfected into host cells, and the
resulting mRNA levels are determined by standard methods
of determining mRNA stability, e.g., Northern blots, S1
mapping or PCR methods, as set forth above and as
described in most of the references cited in the
"Background Art" section. See also, Sambrook et al.
(1989), supra, for experimental methods. The protein
produced from such genes is also easily quantitated by
existing assays, such as ELISAs, immunoprecipitation and
western blots, which are also described in most of the
references cited in the "Background Art" section. See
also, Sambrook et al. (1989), supra, for experimental
methods. The hybrid DNAs containing the
inhibitory/instability region (or regions, if there are
more than one) will be identified by a decrease in the
protein expression and/or stability of the hybrid mRNA as
compared to the control indicator mRNA. The use of
various fragments of the gene permits the identification
of multiple independently functional
inhibitory/instability regions, if any, while the use of
overlapping fragments lessens the possibility that an
inhibitory/instability region will not be identified as a
result of its being cut in half, for example.
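
The overlapping-fragment strategy can be planned on paper before any cloning. The Python sketch below lays out fragment coordinates for a gene of a given length; the function name, the default fragment length of 1000 nucleotides and the default overlap of 300 nucleotides are illustrative assumptions chosen to fall within the 300 to 2000 nucleotide range given above, and in practice the boundaries would be dictated by available restriction sites or PCR primers.

```python
def overlapping_fragments(gene_length, fragment_length=1000, overlap=300):
    """Return (start, end) pairs, 0-based and end-exclusive, tiling the gene.

    Consecutive fragments share at least `overlap` nucleotides, so an
    element shorter than the overlap lies intact in at least one fragment.
    """
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    if not 0 <= overlap < fragment_length:
        raise ValueError("overlap must be smaller than fragment_length")
    step = fragment_length - overlap
    fragments = []
    start = 0
    while start + fragment_length < gene_length:
        fragments.append((start, start + fragment_length))
        start += step
    # Anchor the last fragment at the 3' end so it keeps the full length.
    fragments.append((max(0, gene_length - fragment_length), gene_length))
    return fragments
```

For a 2500-nucleotide gene with the defaults, this yields `[(0, 1000), (700, 1700), (1400, 2400), (1500, 2500)]`. A larger overlap costs more constructs but lowers the chance of splitting an element between two fragments.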

The exemplified test vectors set forth in Fig.
1(B) and Fig. 6 and described herein, e.g., vectors
p17M1234, p37M1234, p37M1-10D and p19, can be used to
assay for the presence and location of INS in various
RNAs, including INS which are located within coding
regions. These vectors can also be used to determine
whether a gene of interest not yet characterized has INS
which are candidates for mutagenesis curing. These
vectors have a particular advantage over the prior art in
that the same vectors can be used in the mutagenesis step
of the invention (described below), in which the identified
INS is eliminated without affecting the coding capacity of
the gene.

The method of using these vectors involves
introducing the entire gene, entire cDNA or fragments of
the gene ranging from approximately 300 nucleotides to
approximately 2 kilobases 3' to the coding region for gag
protein using unique restriction sites which are
engineered into the vectors. The expression of the gag
gene in HLtat cells is measured at both the RNA and
protein levels, and compared to the expression of the
starting vectors. A decrease in expression indicates the
presence of INS candidates that may be cured by
mutagenesis. The method of using the vectors exemplified
in Fig. 1 herein involves introducing the entire gene and
fragments of the gene of interest into vectors p17M1234,
p37M1234 and p19. The fragments are
preferably 300-2000 nucleotides long. Plasmid DNA is
prepared in E. coli and purified by the CsCl method.

To permit detection of inhibitory/instability
regions which do not need to be translated in order to
function, the entire gene and fragments of the gene of
interest are introduced into vectors p17M1234, p37M1234 or
p19 3' of the stop codon of the p17gag coding region. To
allow the detection of inhibitory/instability regions that
affect expression only when translated, the described
vectors can be manipulated so that the coding region of
the entire gene or fragments of the gene of interest are
fused in frame to the expressed gag protein gene. For
example, a fragment containing all or part of the coding
region of the gene of interest can be inserted exactly 3'
to the termination codon of the gag coding sequence in
vector p37M1234 and the termination codon of gag and the
linker sequences can be removed by oligonucleotide
mutagenesis in such a way as to fuse the gag reading frame
to the reading frame of the gene of interest.

RNA and protein production from the two
expression vectors (e.g., p37M1234 containing the fragment
of the gene of interest inserted directly 3' of the stop
codon of the gag coding region, with the gag termination
codon intact, and p37M1234 containing the fragment of the
gene of interest inserted in frame with the gag coding
region, with the gag termination codon deleted) are then
compared after transfection of purified DNA into HLtat
cells.

The expression of these vectors after
transfection into human cells is monitored at both the
level of RNA and protein production. RNA levels are
quantitated by, e.g., Northern blots, S1 mapping or PCR
methods. Protein levels are quantitated by, e.g., western
blot or ELISA methods. p37M1234 and p37M1-10D are ideal
for quantitative analysis because a fast non-radioactive
ELISA protocol can be used to detect gag protein (DUPONT
or COULTER gag antigen capture assay). A decrease in the
level of expression of the gag antigen indicates the
presence of inhibitory/instability regions within the
cloned gene or fragment of the gene of interest.

After the inhibitory/instability regions have
been identified, the vectors containing the appropriate
INS fragments can be used to prepare single-stranded DNA
and then used in mutagenesis experiments with specific
chemically synthesized oligonucleotides in the clustered
mutagenesis protocol described below.

3. Mutation of the Inhibitory/Instability
Regions to Generate Stable mRNAs

Once the inhibitory/instability sequences are
located within the coding region of an mRNA, the gene is
modified to remove these inhibitory/instability sequences
without altering the coding capacity of the gene.
Alternatively, the gene is modified to remove the
inhibitory/instability sequences, simultaneously altering
the coding capacity of the gene to encode either
conservative or non-conservative amino acid substitutions.

In the method of the invention, the most general
method of eliminating the INS in the coding region of the
gene of interest is by making multiple mutations in the
INS region of the gene or gene fragments, without changing
the amino acid sequence of the protein encoded by the
gene; or, if conservative and non-conservative amino acid
substitutions are to be made, the only requirement is that
the protein encoded by the mutated gene be substantially
similar to the protein encoded by the non-mutated gene.
It is preferred that all of the suspected
inhibitory/instability regions, if more than one, be
mutated at once. Later, if desired, each
inhibitory/instability region can be mutated separately in
order to determine the smallest region of the gene that
needs to be mutated in order to generate a stable mRNA.
The ability to mutagenize long DNA regions at the same
time can decrease the time and effort needed to produce
the desired stable and/or highly expressed mRNA and
resulting protein. The altered gene or gene fragments
containing these mutations will then be tested in the
usual manner, as described above, e.g., by fusing the
altered gene or gene fragment with a reporter or indicator
gene and analyzing the level of mRNA and protein produced
by the altered genes after transfection into an
appropriate host cell. If the level of mRNA and protein
produced by the hybrid gene containing the altered gene or
gene fragment is about the same as that produced by the
control construct encoding only the indicator gene, then
the inhibitory/instability regions have been effectively
eliminated from the gene or gene fragment due to the
alterations made in the INS.

In the method of the invention, more than two
point mutations will be made in the INS region.
Optionally, point mutations may be made in at least about
10% of the nucleotides in the inhibitory/instability
region. These point mutations may also be clustered. The
nucleotides to be altered can be chosen randomly (i.e.,
not chosen because of AT or GC content or the presence or
absence of rare or preferred codons), the only requirement
being that the amino acid sequence encoded by the protein
remain unchanged; or, if conservative and non-conservative
amino acid substitutions are to be made, the only
requirement is that the protein encoded by the mutated
gene be substantially similar to the protein encoded by
the non-mutated gene.
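
A candidate set of point mutations can be checked against these criteria before any oligonucleotide is ordered. The following Python sketch counts the altered nucleotides in an in-frame region and confirms whether the encoded amino acid sequence is unchanged. The function names and the example sequences are hypothetical and are not taken from the constructs of this specification; the sketch assumes uppercase or lowercase DNA letters A, C, G and T only.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def summarize_mutations(original, mutated):
    """Count point mutations and report whether they are all silent."""
    if len(original) != len(mutated) or len(original) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    original, mutated = original.upper(), mutated.upper()
    changed = sum(a != b for a, b in zip(original, mutated))
    return {
        "changed": changed,
        "fraction_changed": changed / len(original),
        "silent": translate(original) == translate(mutated),
    }
```

For instance, `summarize_mutations("ATGAAATTTAGA", "ATGAAGTTCCGC")` reports 4 changed nucleotides out of 12, all silent, which would satisfy both the "more than two point mutations" and the optional "at least about 10%" criteria for that short region.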

In the method of the present invention, the gene
sequence can be mutated so that the encoded protein
remains the same due to the fact that the genetic code is
degenerate, i.e., many of the amino acids may be encoded
by more than one codon. The base code for serine, for
example, is six-way degenerate such that the codons TCT,
TCG, TCC, TCA, AGT, and AGC all code for serine.
Similarly, threonine is encoded by any one of codons ACT,
ACA, ACC and ACG. Thus, a plurality of different DNA
sequences can be used to code for a particular set of
amino acids. The codons encoding the other amino acids
are TTT and TTC for phenylalanine; TTA, TTG, CTT, CTC, CTA
and CTG for leucine; ATT, ATC and ATA for isoleucine; ATG
for methionine; GTT, GTC, GTA and GTG for valine; CCT, CCC,
CCA and CCG for proline; GCT, GCC, GCA and GCG for
alanine; TAT and TAC for tyrosine; CAT and CAC for
histidine; CAA and CAG for glutamine; AAT and AAC for
asparagine; AAA and AAG for lysine; GAT and GAC for
aspartic acid; GAA and GAG for glutamic acid; TGT and TGC
for cysteine; TGG for tryptophan; CGT, CGC, CGA, CGG, AGA
and AGG for arginine; and GGT, GGC, GGA and GGG for glycine.
Charts depicting the codons (i.e., the genetic code) can
be found in various general biology or biochemistry
textbooks.
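
To give a sense of how much freedom this degeneracy provides, the Python sketch below groups the standard codons by amino acid and counts how many distinct DNA sequences encode a given peptide. The names `SYNONYMS` and `count_encodings` are hypothetical, chosen for this illustration.

```python
from collections import defaultdict
from math import prod

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Map each amino acid (one-letter code) to all codons that encode it.
SYNONYMS = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS[amino_acid].append(codon)


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide.upper())
```

A serine-threonine dipeptide, for example, can be written 6 x 4 = 24 ways (`count_encodings("ST")` returns 24), whereas methionine-tryptophan has only one encoding. The number of options grows multiplicatively with length, which is why a region can usually be altered extensively without changing the protein.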

In the method of the present invention, if the
portion(s) of the gene encoding the inhibitory/instability
regions are AT-rich, it is preferred, but not believed to
be necessary, that most or all of the mutations in the
inhibitory/instability region be the replacement of A and
T with G and C nucleotides, making the regions more
GC-rich, while still maintaining the coding capacity of the
gene. If the portion(s) of the gene encoding the
inhibitory/instability regions are GC-rich, it is
preferred, but not believed to be necessary, that most or
all of the mutations in the inhibitory/instability region
be the replacement of G and C nucleotides with A and T
nucleotides, making the regions less GC-rich, while still
maintaining the coding capacity of the gene. If the INS
region is either AT-rich or GC-rich, it is most preferred
that it be altered so that it has a content of about 50% G
and C and about 50% A and T. The AT- (or AU-) content
(or, alternatively, the GC-content) of an
inhibitory/instability region or regions can be calculated
by using a computer program designed to make such
calculations. Examples of such programs, used to
determine the AT-richness of the HIV-1 gag
inhibitory/instability regions exemplified herein, are the
GCG Analysis Package for the VAX (University of Wisconsin)
and the GeneWorks Package (Intelligenetics).
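
The AT/GC tally described above is simple to reproduce. The following Python sketch is illustrative only and is not one of the packages named in the text; at_gc_counts and at_fraction are hypothetical helper names. The test sequence is the mutated M1 oligonucleotide (SEQ ID NO: 6) as transcribed in this description, which is an assumption about the correct reading of that sequence.

```python
def at_gc_counts(seq):
    """Return (AT, GC) nucleotide counts; U is treated as T."""
    s = seq.upper().replace("U", "T")
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return at, gc


def at_fraction(seq):
    """Fraction of A/T nucleotides among all A, T, G and C nucleotides."""
    at, gc = at_gc_counts(seq)
    total = at + gc
    return at / total if total else 0.0


# Assumed transcription of the mutated M1 oligonucleotide (51 nucleotides).
M1_MUTATED = "ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg"

if __name__ == "__main__":
    print(at_gc_counts(M1_MUTATED))           # (AT, GC) counts
    print(round(at_fraction(M1_MUTATED), 2))  # close to the preferred 0.5
```

For this sequence the tally agrees with the 25AT/26GC ratio reported for M1 after mutation.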

In the method of the invention, if the INS
region contains less-preferred codons, it is preferable
that those be altered to more-preferred codons. If
desired, however (e.g., to make an AT-rich region more
GC-rich), more-preferred codons can be altered to
less-preferred codons. It is also preferred, but not believed
to be necessary, that less-preferred or rarely used codons
be replaced with more-preferred codons. Optionally, only
the most rarely used codons (identified from published
codon usage tables, such as in T. Maruyama et al., Nucl.
Acids Res. 14(Supp.):r151-r197 (1986)) can be replaced with
preferred codons, or alternatively, most or all of the
rare codons can be replaced with preferred codons.
Generally, the choice of preferred codons to use will
depend on the codon usage of the host cell in which the
altered gene is to be expressed. Note, however, that the
substitution of more-preferred codons with less-preferred
codons is also functional, as shown in the example below.
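
Replacement of less-preferred codons by more-preferred synonymous codons can be sketched as follows. This Python is a minimal illustration under stated assumptions, not the procedure actually used: HUMAN_USAGE is a deliberately partial table holding only a few of the per-1000-codon human frequencies quoted later in this description, and prefer and recode are hypothetical helper names. Codons absent from the partial table are left unchanged.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Partial usage table (occurrences per 1000 codons), for illustration only.
HUMAN_USAGE = {
    "AAA": 22.0, "AAG": 35.8,   # lysine
    "TAT": 12.4, "TAC": 18.4,   # tyrosine
    "CAT": 9.8, "CAC": 14.3,    # histidine
    "GAA": 26.8, "GAG": 41.6,   # glutamic acid
}


def prefer(codon, usage):
    """Return the most frequently used synonymous codon listed in `usage`.

    A codon that is not itself listed in `usage` is returned unchanged.
    """
    codon = codon.upper()
    if codon not in usage:
        return codon
    amino_acid = CODON_TABLE[codon]
    candidates = [c for c in usage if CODON_TABLE[c] == amino_acid]
    return max(candidates, key=usage.get)


def recode(dna, usage):
    """Recode an in-frame coding sequence codon by codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(prefer(dna[i:i + 3], usage) for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(recode("ATGAAATATCATGAA", HUMAN_USAGE))
```

Because every replacement is drawn from the synonymous set, the encoded amino acid sequence is unchanged; a complete usage table for the intended host cell would be substituted in practice.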

As noted above, coding sequences are chosen on 
the basis of the genetic code and, preferably, on the
preferred codon usage in the host cell or organism in 
which the mutated gene of this invention is to be 
expressed. In a number of cases the preferred codon usage 
of a particular host or expression system can be 
ascertained from available references (see, e.g., T.
Maruyama et al., Nucl. Acids Res. 14(Supp.):r151-r197
(1986)), or can be ascertained by other methods (see,
e.g., U.S. Patent No. 5,082,757 entitled "Codon Pair 
Utilization", issued to G. W. Hatfield et al. on January 
21, 1992, which is incorporated herein by reference). 
Preferably, sequences will be chosen to optimize 
transcription and translation as well as mRNA stability so 
as to ultimately increase the amount of protein produced. 
Selection of codons is thus, for example, guided by the 
preferred use of codons by the host cell and/or the need 
to provide for desired restriction endonuclease sites and 
could also be guided by a desire to avoid potential 
secondary structure constraints in the encoded mRNA 
transcript. Potential secondary structure constraints can 
be identified by the use of computer programs such as the 
one described in M. Zucker et al., Nucl. Acids Res. 9:133
(1981). More than one coding sequence may be chosen in
situations where the codon preference is unknown or 
ambiguous for optimum codon usage in the chosen host cell 
or organism. However, any correct set of codons would 
encode the desired protein, even if translated with less 
than optimum efficiency. 

In the method of the invention, if the INS 
region contains conserved nucleotides, it is also 
preferred, but not believed to be necessary, that 
conserved nucleotide sequences in the
inhibitory/instability region be mutated. Optionally, at
least approximately 75% of the mutations made in the 
inhibitory/instability region may involve the mutation of 
conserved nucleotides. Conserved nucleotides can be 
determined by using a variety of computer programs 
available to practitioners of the art. 

In the method of the invention, it is also 
anticipated that inhibitory/instability sequences can be 
mutated such that the encoded amino acids are changed to
contain one or more conservative or non-conservative amino
acids yet still provide for a functionally equivalent 
protein. For example, one or more amino acid residues 
within the sequence can be substituted by another amino 
acid of a similar polarity which acts as a functional 
equivalent, resulting in a neutral substitution in the 
amino acid sequence. Substitutes for an amino acid within 
the sequence may be selected from other members of the 
class to which the amino acid belongs. For example, the 
nonpolar (hydrophobic) amino acids include alanine, 
leucine, isoleucine, valine, proline, phenylalanine, 
tryptophan and methionine. The polar neutral amino acids 
include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine. The positively charged (basic)
amino acids include arginine, lysine and histidine. The
negatively charged (acidic) amino acids include aspartic 
acid and glutamic acid. 
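
The four classes listed above can be expressed as a lookup for deciding whether a proposed amino acid substitution stays within a class. The Python below is an illustrative sketch only; AA_CLASSES, class_of and is_conservative are hypothetical names, and treating "same class" as the test for a conservative substitution is a simplifying assumption drawn from the grouping in the text.

```python
# Amino acid classes exactly as grouped in the text (three-letter codes).
AA_CLASSES = {
    "nonpolar": {"Ala", "Leu", "Ile", "Val", "Pro", "Phe", "Trp", "Met"},
    "polar neutral": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "basic": {"Arg", "Lys", "His"},
    "acidic": {"Asp", "Glu"},
}


def class_of(residue):
    """Return the class name for a three-letter amino acid code."""
    for name, members in AA_CLASSES.items():
        if residue in members:
            return name
    raise KeyError(f"unknown amino acid: {residue}")


def is_conservative(original, substitute):
    """True if both residues belong to the same class."""
    return class_of(original) == class_of(substitute)


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))  # both nonpolar
    print(is_conservative("Lys", "Glu"))  # basic versus acidic
```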

In the exemplified method of the present
invention, all of the regions in the HIV-1 gag gene
suspected to have inhibitory/instability activity were
first mutated at once over a region approximately 270
nucleotides in length using clustered site-directed
mutagenesis with four different oligonucleotides spanning
a region of approximately 300 nucleotides to generate the
construct p17M1234, described infra, which encodes a
stable mRNA.
The four oligonucleotides, which are depicted in
Fig. 4, are

M1: ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg
(SEQ ID NO: 6);

M2: ccttcagacaggatcagaggagcttcgatcactatacaacacagtagc
(SEQ ID NO: 7);

M3: accctctattgtgtgcaccagcggatcgagatcaaggacaccaaggaagc
(SEQ ID NO: 8); and

M4: gagcaaaacaagtccaagaagaaggcccagcaggcagcagctgacacagg
(SEQ ID NO: 9).

These oligonucleotides are 51 (M1), 48 (M2), 50
(M3) and 50 (M4) nucleotides in length. Each
oligonucleotide introduced several point mutations over an
area of 19-22 nucleotides (see infra). The number of
nucleotides 5' to the first mutated nucleotide was 14
(M1); 18 (M2); 17 (M3); and 11 (M4); and the number of
nucleotides 3' to the last mutated nucleotide was 15
(M1); 8 (M2); 14 (M3); and 17 (M4). The ratios of AT to
GC nucleotides present in each of these regions before
mutation were 33AT/18GC (M1); 30AT/18GC (M2); 29AT/21GC
(M3) and 27AT/23GC (M4). The ratios of AT to GC
nucleotides present in each of these regions after
mutation were 25AT/26GC (M1); 24AT/24GC (M2); 23AT/27GC
(M3) and 22AT/28GC (M4). A total of 26 codons were
changed. The number of times the codon appears in human
genes per 1000 codons (from T. Maruyama et al., Nucl. Acids
Res. 14(Supp.):r151-r197 (1986)) is listed in parentheses
next to the codon. In the example, 8 codons encoding
lysine (Lys) were changed from aaa (22.0) to aag (35.8);
two codons encoding tyrosine (Tyr) were changed from tat
(12.4) to tac (18.4); two codons encoding leucine (Leu)
were changed from tta (5.9) to cta (6.1); two codons
encoding histidine (His) were changed from cat (9.8) to
cac (14.3); three codons encoding isoleucine (Ile) were
changed from ata (5.1) to atc (24.0); two codons encoding
glutamic acid (Glu) were changed from gaa (26.8) to gag
(41.6); one codon encoding arginine (Arg) was changed from
aga (10.8) to cga (5.2) and one codon encoding arginine
(Arg) was changed from agg (11.4) to cgg (7.7); one codon
encoding asparagine (Asn) was changed from aat (16.9) to
aac (23.6); two codons encoding glutamine (Gln) were
changed from caa (11.5) to cag (32.7); one codon encoding
serine (Ser) was changed from agt (8.7) to tcc (18.7); and
one codon encoding alanine (Ala) was changed from gca
(12.7) to gcc (29.8).
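
The codon changes enumerated above can be cross-checked mechanically. The Python below is an illustrative sketch, not part of the original disclosure; CHANGES and summarize are hypothetical names, and the codon spellings are taken as read from the paragraph above. The sketch confirms that every change is synonymous under the standard genetic code, that the changes total 26 codons, and that they amount to 28 single-nucleotide differences, which agrees with the 28 point mutations reported for p17M1234 in Example 1.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# (original codon, replacement codon, number of codons changed)
CHANGES = [
    ("AAA", "AAG", 8),  # Lys
    ("TAT", "TAC", 2),  # Tyr
    ("TTA", "CTA", 2),  # Leu
    ("CAT", "CAC", 2),  # His
    ("ATA", "ATC", 3),  # Ile
    ("GAA", "GAG", 2),  # Glu
    ("AGA", "CGA", 1),  # Arg
    ("AGG", "CGG", 1),  # Arg
    ("AAT", "AAC", 1),  # Asn
    ("CAA", "CAG", 2),  # Gln
    ("AGT", "TCC", 1),  # Ser
    ("GCA", "GCC", 1),  # Ala
]


def summarize(changes):
    """Return (codons changed, nucleotide differences, all changes silent)."""
    codons = sum(n for _, _, n in changes)
    nucleotides = sum(
        n * sum(x != y for x, y in zip(old, new)) for old, new, n in changes
    )
    silent = all(CODON_TABLE[old] == CODON_TABLE[new] for old, new, _ in changes)
    return codons, nucleotides, silent


if __name__ == "__main__":
    print(summarize(CHANGES))
```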

The techniques of oligonucleotide-directed site- 
specific mutagenesis employed to effect the modifications 
in structure or sequence of the DNA molecule are known to
those of skill in the art. The target DNA sequences which 
are to be mutagenized can be cDNA, genomic DNA or 
synthesized DNA sequences. Generally, these DNA sequences 
are cloned into an appropriate vector, e.g., a 
bacteriophage M13 vector, and single-stranded template DNA 
is prepared from a plaque generated by the recombinant 
bacteriophage. The single-stranded DNA is annealed to the 
synthetic oligonucleotides and the mutagenesis and 
subsequent steps are performed by methods well known in 
the art. See, e.g., M. Smith and S. Gillam, in Genetic
Engineering: Principles and Methods, Plenum Press 3:1-32
(1981) (review) and T. Kunkel, Proc. Natl. Acad. Sci. USA
82:488-492 (1985). See also, Sambrook et al. (1989),
supra. The synthetic oligonucleotides can be synthesized
on a DNA synthesizer (e.g., Applied Biosystems) and
purified by electrophoresis by methods known in the art.
The length of the selected or prepared
oligodeoxynucleotides using this method can vary. There
are no absolute size limits. As a matter of convenience, 
for use in the process of this invention, the shortest 
length of the oligodeoxynucleotide is generally 
approximately 20 nucleotides and the longest length is 
generally approximately 60 to 100 nucleotides. The size
of the oligonucleotide primers is determined by the
requirement for stable hybridization of the primers to the
regions of the gene in which the mutations are to be
induced, and by the limitations of the currently available
methods for synthesizing oligonucleotides. The factors to
be considered in designing oligonucleotides for use in
oligonucleotide-directed mutagenesis (e.g., overall size,
size of portions flanking the mutation(s)) are described
by M. Smith and S. Gillam in Genetic Engineering:
Principles and Methods, Plenum Press 3:1-32 (1981). In
general, the overall length of the oligonucleotide will be 
such as to optimize stable, unique hybridization at the 
mutation site with the 5' and 3' extensions from the 
mutation site being of sufficient size to avoid editing of 
the mutation(s) by the exonuclease activity of the DNA
polymerase. Oligonucleotides used for mutagenesis in the 
present invention will generally be at least about 20 
nucleotides, usually about 40 to 60 nucleotides in length 
and usually will not exceed about 100 nucleotides in 
length. The oligonucleotides will usually contain at 
least about five bases 3' of the altered codons. 
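
The size guidelines above can be turned into a simple design check. The Python below is a sketch under stated assumptions, not a validated design tool: OligoDesign and check are hypothetical names, the thresholds (20 to 100 nucleotides overall, at least five bases 3' of the altered codons) are taken from the text, and the lengths and flank sizes are those reported for M1 to M4 in this description.

```python
from dataclasses import dataclass


@dataclass
class OligoDesign:
    name: str
    length: int   # total nucleotides
    flank5: int   # nucleotides 5' of the first mutated nucleotide
    flank3: int   # nucleotides 3' of the last mutated nucleotide

    @property
    def mutated_span(self):
        """Nucleotides from the first to the last mutated position."""
        return self.length - self.flank5 - self.flank3


def check(design, min_len=20, max_len=100, min_flank3=5):
    """Return a list of guideline violations (empty if none)."""
    problems = []
    if design.length < min_len:
        problems.append("shorter than about 20 nucleotides")
    if design.length > max_len:
        problems.append("longer than about 100 nucleotides")
    if design.flank3 < min_flank3:
        problems.append("fewer than about five bases 3' of the changes")
    return problems


DESIGNS = [
    OligoDesign("M1", 51, 14, 15),
    OligoDesign("M2", 48, 18, 8),
    OligoDesign("M3", 50, 17, 14),
    OligoDesign("M4", 50, 11, 17),
]

if __name__ == "__main__":
    for d in DESIGNS:
        print(d.name, d.mutated_span, check(d))
```

The computed spans fall within the 19-22 nucleotide area of point mutations stated for each oligonucleotide.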

In the preferred mutagenesis protocol of the
present invention, the INS-containing expression vectors
contain the BLUESCRIPT plasmid vector as a backbone. This
enables the preparation of double-stranded as well as
single-stranded DNA. Single-stranded uracil-containing
DNA is prepared according to a standard protocol as
follows: The plasmid is transformed into an F' bacterial
strain (e.g., DH5αF'). A colony is grown and infected
with the helper phage M13-VCS [Stratagene #20025; 1x10^11
pfu/ml]. This phage is used to infect a culture of the
E. coli strain CJ236 and single-stranded DNA is isolated
according to standard methods. 0.25 ug of single-stranded
DNA is annealed with the synthesized oligonucleotides (5
ul of each oligo, dissolved at a concentration of 5
OD260/ml). The synthesized oligonucleotides are usually
about 40 to 60 nucleotides in length and are designed to
contain a perfect match of approximately 10 nucleotides at
each end. They may contain as many changes as desired
within the remaining 20-40 nucleotides. The
oligonucleotides are designed to cover the region of
interest and they may be next to each other or there may
be gaps between them. Up to six different
oligonucleotides have been used at the same time, although
it is believed that the use of more than six
oligonucleotides at the same time would also work in the
method of this invention. After annealing, elongation
with T4 polymerase produces the second strand which does
not contain uracil. The free ends are ligated using
ligase. This results in double-stranded DNA which can be
used to transform E. coli strain HB101. The mutated
strand which does not contain uracil produces double-
stranded DNA, which contains the introduced mutations.
Individual colonies are picked and the mutations are
quickly verified by sequence analysis. Alternatively or
additionally, this mutagenesis method can be (and has been)
used to select for different combinations of
oligonucleotides which result in different mutant
phenotypes. This facilitates the analysis of the regions
important for function and is helpful in subsequent
experiments because it allows the analysis of exact
sequences involved in the INS. In addition to the
exemplified mutagenesis of the INS-1 region of HIV-1
described herein, this method has also been used to mutate
in one step a region of 150 nucleotides using three
tandemly arranged oligonucleotides that introduced a total
of 35 mutations. The upper limit of changes is not clear,
but it is estimated that regions of approximately 500
nucleotides can be changed in 20% of their nucleotides in
one step using this protocol.
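
The design rule stated above, a perfect match of approximately 10 nucleotides at each end with the changes confined to the interior, can be checked by comparing a mutagenic oligonucleotide with the template segment it anneals to. The Python below is a toy sketch only; end_matches is a hypothetical helper, and the two 35-nucleotide sequences are invented for illustration (they are shorter than the 40 to 60 nucleotides typically used and are not taken from the HIV-1 sequence).

```python
def end_matches(template, oligo):
    """Return (5' matching run, 3' matching run, total mismatches).

    Both sequences must be the same length and in the same orientation.
    """
    if len(template) != len(oligo):
        raise ValueError("sequences must be the same length")
    pairs = list(zip(template.upper(), oligo.upper()))
    lead = 0
    for a, b in pairs:
        if a != b:
            break
        lead += 1
    trail = 0
    for a, b in reversed(pairs):
        if a != b:
            break
        trail += 1
    mismatches = sum(a != b for a, b in pairs)
    return lead, trail, mismatches


# Invented example: 10-nt matching ends around five silent codon changes.
TEMPLATE = "CCGGCCGGCC" + "AAATATAAAGAAAAT" + "GGCCGGCCGG"
OLIGO = "CCGGCCGGCC" + "AAGTACAAGGAGAAC" + "GGCCGGCCGG"

if __name__ == "__main__":
    lead, trail, mismatches = end_matches(TEMPLATE, OLIGO)
    print(lead >= 10 and trail >= 10, mismatches)
```

The same comparison also gives the mutation density of a design; for the one-step mutagenesis reported above, 35 mutations over 150 nucleotides corresponds to roughly 23% of positions changed.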

The exemplified method of mutating by using 
oligonucleotide-directed site-specific mutagenesis may be 
varied by using other methods known in the art. For 
example, the mutated gene can be synthesized directly 
using overlapping synthetic deoxynucleotides (see, e.g.,
Edge et al., Nature 292:756 (1981); Nambair et al.,
Science 223:1299 (1984); Jay et al., J. Biol. Chem.
259:6311 (1984)), or by using a combination of polymerase
chain reaction generated DNAs or cDNAs and synthesized
oligonucleotides.

4. Determination of Stability of the 
Mutated mRNA 

The steady state level and/or stability of the 
resultant mutated mRNAs can be tested in the same manner 
as the steady state level and/or stability of the 
unmodified mRNA containing the inhibitory/instability 
regions are tested (e.g., by Northern blotting), as 
discussed in section 1, above. The mutated mRNA can be
analyzed along with (and thus compared to) the unmodified
mRNA containing the inhibitory/instability region(s) and
with an unmodified indicator mRNA, if desired. As
exemplified, the HIV-1 p17gag mutants are compared to the
unmutated HIV-1 p17gag in transfection experiments by
subsequent analysis of the mRNAs by Northern blot
analysis. The proteins produced by these mRNAs are
measured by immunoblotting and other methods known in the
art, such as ELISA. See infra.

VI. INDUSTRIAL APPLICABILITY

Genes which can be mutated by the methods of 
this invention include those whose mRNAs are known or 
suspected of containing INS regions in their mRNAs. These 
genes include, for example, those coding for growth 
factors, interferons, interleukins, the fos proto-oncogene
protein, and HIV-1 gag, env and pol, as well as other
viral mRNAs in addition to those exemplified herein.
Genes mutated by the methods of this invention can be
expressed in the native host cell or organism or in a
different cell or organism. The mutated genes can be
introduced into a vector such as a plasmid, cosmid, phage,
virus or mini-chromosome and inserted into a host cell or
organism by methods well known in the art. In general,
the mutated genes or constructs containing these mutated
genes can be utilized in any cell, either eukaryotic or
prokaryotic, including mammalian cells (e.g., human (e.g.,
HeLa), monkey (e.g., Cos), rabbit (e.g., rabbit
reticulocytes), rat, hamster (e.g., CHO and baby hamster
kidney cells) or mouse cells (e.g., L cells)), plant cells,
yeast cells, insect cells or bacterial cells (e.g., E.
coli). The vectors which can be utilized to clone and/or
express these mutated genes are the vectors which are
capable of replicating and/or expressing the mutated genes
in the host cell in which the mutated genes are desired to
be replicated and/or expressed. See, e.g., F. Ausubel et
al., Current Protocols in Molecular Biology, Greene
Publishing Associates and Wiley-Interscience (1992) and
Sambrook et al. (1989) for examples of appropriate vectors
for various types of host cells. The native promoters for
such genes can be replaced with strong promoters
compatible with the host into which the gene is inserted.
These promoters may be inducible. The host cells 
containing these mutated genes can be used to express 
large amounts of the protein useful in enzyme 
preparations, pharmaceuticals, diagnostic reagents, 
vaccines and therapeutics. 

Genes altered by the methods of the invention or 
constructs containing said genes may also be used for in
vivo or in vitro gene replacement. For example, a gene
which produces an mRNA with an inhibitory/instability
region can be replaced with a gene that has been modified
by the method of the invention in situ to ultimately
increase the amount of protein expressed. Such genes
include viral genes and/or cellular genes. Such gene
replacement might be useful, for example, in the 
development of a vaccine and/or genetic therapy. 
The constructs and/or proteins made by using
constructs encoding the exemplified altered gag, env, and 
pol genes could be used, for example, in the production of 
diagnostic reagents, vaccines and therapies for AIDS and 
AIDS related diseases. The inhibitory/instability 
elements in the exemplified HIV-1 gag gene may be involved 
in the establishment of a state of low virus production in 
the host. HIV-1 and the other lentiviruses cause chronic 
active infections that are not cleared by the immune 
system. It is possible that complete removal of the 
inhibitory/instability sequence elements from the 
lentiviral genome would result in constitutive expression. 
This could prevent the virus from establishing a latent 
infection and escaping immune system surveillance. The 
success in increasing expression of p17gag by eliminating
the inhibitory sequence element suggests that one could 
produce lentiviruses without any negative elements. Such 
lentiviruses could provide a novel approach towards 
attenuated vaccines. 

For example, vectors expressing high levels of 
Gag can be used in immunotherapy and immunoprophylaxis,
after expression in humans. Such vectors include
retroviral vectors and also include direct injection of
DNA into muscle cells or other receptive cells, resulting
in the efficient expression of gag, using the technology
described, for example, in Wolff et al., Science
247:1465-1468 (1990), Wolff et al., Human Molecular Genetics
1(6):363-369 (1992) and Ulmer et al., Science
259:1745-1749 (1993). Further, the gag constructs could be used in
transdominant inhibition of HIV expression after the
introduction into humans. For this application, for
example, appropriate vectors or DNA molecules expressing
high levels of p55gag or p37gag would be modified to generate
transdominant gag mutants, as described, for example, in
Trono et al., Cell 59:113-120 (1989). The vectors would
be introduced into humans, resulting in the inhibition of
HIV production due to the combined mechanisms of gag
transdominant inhibition and of immunostimulation by the 
produced gag protein. In addition, the gag constructs of 
the invention could be used in the generation of new 
retroviral vectors based on the expression of lentiviral 
gag proteins. Lentiviruses have unique characteristics 
that may allow the targeting and efficient infection of 
non-dividing cells. Similar applications are expected for 
vectors expressing high levels of env. 

Identification of similar inhibitory/instability 
elements in SIV indicates that this virus may provide a 
convenient model to test these hypotheses. 

The exemplified constructs can also be used to 
simply and rapidly detect and/or further define the 
boundaries of inhibitory/instability sequences in any mRNA
which is known or suspected to contain such regions, e.g., 
in mRNAs encoding various growth factors, interferons or 
interleukins, as well as other viral mRNAs in addition to 
those exemplified herein. 

The following examples illustrate certain 
embodiments of the present invention, but should not be 
construed as limiting its scope in any way. Certain 
modifications and variations will be apparent to those 
skilled in the art from the teachings of the foregoing 
disclosure and the following examples, and these are 
intended to be encompassed by the spirit and scope of the 
invention.
EXAMPLE 1
HIV-1 GAG GENE
The interaction of the Rev regulatory protein of
human immunodeficiency virus type 1 (HIV-1) with its RNA
target, named the Rev-responsive element (RRE), is
necessary for expression of the viral structural proteins
(for reviews see G. Pavlakis and B. Felber, New Biol.
2:20-31 (1990); B. Cullen and W. Greene, Cell 58:423-426
(1989); and C. Rosen and G. Pavlakis, AIDS J. 4:499-509
(1990)). Rev acts by promoting the nuclear export and
increasing the stability of the RRE-containing mRNAs.
Recent results also indicate a role for Rev in the
efficient polysome association of these mRNAs (S. Arrigo
and I. Chen, Genes Dev. 5:808-819 (1991); D. D'Agostino et
al., Mol. Cell. Biol. 12:1375-1386 (1992)). Since the
RRE-containing HIV-1 mRNAs do not efficiently produce protein
in the absence of Rev, it has been postulated that these
mRNAs are defective and contain inhibitory/instability
sequences variously designated as INS, CRS, or IR (M.
Emerman et al., Cell 57:1155-1165 (1989); S. Schwartz et
al., J. Virol. 66:150-159 (1992); C. Rosen et al., Proc.
Natl. Acad. Sci. USA 85:2071-2075 (1988); M. Hadzopoulou-
Cladaras et al., J. Virol. 63:1265-1274 (1989); F.
Maldarelli et al., J. Virol. 65:5732-5743 (1991); A. W.
Cochrane et al., J. Virol. 65:5305-5313 (1991)). The
nature and function of these inhibitory/instability
sequences have not been characterized in detail. It has
been postulated that inefficiently used splice sites may
be necessary for Rev function (D. Chang and P. Sharp, Cell
59:789-795 (1989)); the presence of such splice sites may
confer Rev-dependence to HIV-1 mRNAs.

Analysis of HIV-1 hybrid constructs led to the
initial characterization of some inhibitory/instability
sequences in the gag and pol regions of HIV-1 (S. Schwartz
et al., J. Virol. 66:150-159 (1992); F. Maldarelli et al.,
J. Virol. 65:5732-5743 (1991); A. W. Cochrane et al., J.
Virol. 65:5305-5313 (1991)). The identification of an
inhibitory/instability RNA element located in the coding
region of the p17gag matrix protein of HIV-1 was also
reported (S. Schwartz et al., J. Virol. 66:150-159
(1992)). It was shown that this sequence acted in cis to
inhibit HIV-1 tat expression after insertion into a tat
cDNA. The inhibition could be overcome by Rev-RRE,
demonstrating that this element plays a role in regulation
by Rev.



1. p17gag expression plasmids
To further study the inhibitory/instability
element in p17gag, a p17gag expression plasmid (p17, Fig. 1)
was constructed. The p17gag sequence was engineered to
contain a translational stop codon immediately after the
coding sequence and thus could produce only p17gag (the
construction of this plasmid is described below). The
major 5' splice site of HIV-1 upstream of the gag AUG has
been deleted from this vector (B. Felber et al., Proc.
Natl. Acad. Sci. USA 86:1495-1499 (1989)). To investigate
whether plasmid p17 could produce p17gag in the absence of
Rev and the RRE, p17 was transfected into HLtat cells (S.
Schwartz et al., J. Virol. 64:2519-2529 (1990)) (see
below). These cells constitutively produce HIV-1 Tat
protein, which is necessary for transactivation of the
HIV-1 LTR promoter. Plasmid p17 was transfected in the
absence or presence of Rev, and the production of p17gag
was analyzed by western immunoblotting. The results
revealed that very low levels of p17gag protein were
produced (Fig. 2A). The presence of Rev did not increase
gag expression, as expected, since this mRNA did not
contain the RRE. Next, a plasmid that contained both the
p17gag coding sequence and the RRE (p17R, Fig. 1) was
constructed. Like p17, this plasmid produced very low
levels of p17gag in the absence of Rev. High levels of
p17gag were produced only in the presence of Rev (Fig. 2A).
These experiments suggested that an inhibitory/instability
element was located in the p17gag coding sequence.

Expression experiments using various eucaryotic
vectors have indicated that several other retroviruses do
not contain such inhibitory/instability sequences within
their coding sequences (see, for example, J. Wills et al.,
J. Virol. 63:4331-43 (1989) and V. Morris et al., J.
Virol. 62:349-53 (1988)). To verify these results, the
p17gag (matrix) gene of HIV-1 in plasmid p17 was replaced
with the coding sequence for p19gag (matrix), which is the
homologous protein of the Rous sarcoma virus (RSV, strain
SR-A). The resulting plasmid, p19 (Fig. 1), was identical
to plasmid p17, except for the gag coding sequence. The
production of p19gag protein from plasmid p19 was analyzed
by western immunoblotting, which revealed that this
plasmid produced high levels of p19gag (Fig. 2A). These
experiments demonstrated that the p19gag coding sequence of
RSV, in contrast to p17gag of HIV-1, could be efficiently
expressed in this vector, indicating that the gag region
of RSV did not contain any inhibitory/instability
elements. A derivative of plasmid p19 that contained the
RRE, named p19R (Fig. 1), was also constructed.
Interestingly, only very low levels of p19gag protein were
produced from the RRE-containing plasmid p19R in the
absence of Rev. This observation indicated that the
introduced RRE and 3' HIV-1 sequences exerted an
inhibitory effect on p19gag expression from plasmid p19R,
which is in agreement with recent data indicating that in
the absence of Rev, a longer region at the 3' end of the
virus including the RRE acts as an inhibitory/instability
element (G. Nasioulas, G. Pavlakis, B. Felber, manuscript
in preparation). In conclusion, the high levels of
expression of RSV p19gag in the same vector reinforced the
conclusion that an inhibitory/instability sequence within
the HIV-1 p17gag coding region was responsible for the very
low levels of expression.

It was next determined whether the
inhibitory/instability effect of the p17gag coding sequence
was detected also at the mRNA level. Northern blot
analysis of RNA extracted from HLtat cells transfected
with p17 or transfected with p17R demonstrated that p17R
produced lower mRNA levels in the absence of Rev (Fig. 3A)
(see Example 3). A two- to eight-fold increase in p17R
mRNA levels was observed after coexpression with Rev.
Plasmid p17 produced mRNA levels similar to those produced
by p17R in the absence of Rev. Notably, Rev decreased the
levels of mRNA and protein produced by mRNAs that do not
contain RRE. This inhibitory effect of Rev in
cotransfection experiments has been observed for many
other non-RRE-containing mRNAs, such as luciferase and CAT
(L. Solomin et al., J. Virol. 64:6010-6017 (1990); D. M.
Benko et al., New Biol. 2:1111-1122 (1990)). These results
established that the inhibitory element in gag also
affects the mRNA levels and are in agreement with previous
findings (S. Schwartz et al., J. Virol. 66:150-159
(1992)). Quantitations of the mRNA and protein levels
produced by p17R in the absence or presence of Rev were
performed by scanning densitometry of appropriate serial
dilutions of the samples, and indicated that the
difference was greater at the level of protein (60- to
100-fold) than at the level of mRNA (2- to 8-fold). This
result is compatible with previous findings of effects of
Rev on mRNA localization and polysomal loading of both gag
and env mRNAs (S. Arrigo et al., Genes Dev. 5:808-819
(1991); D. D'Agostino et al., Mol. Cell. Biol.
12:1375-1386 (1992); M. Emerman et al., Cell 57:1155-1165 (1989);
B. Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989); M. Malim et al., Nature (London) 338:254-257
(1989)). Northern blot analysis of the mRNAs produced by
the RSV gag expression plasmids revealed that p19 produced
high mRNA levels (Fig. 3B). This further demonstrated
that the p19gag coding sequence of RSV does not contain
inhibitory elements. The presence of the RRE and 3' HIV-1
sequences in plasmid p19R resulted in decreased mRNA
levels in the absence of Rev, further suggesting that
inhibitory elements were present in these sequences.
Taken together, these results established that gag
expression in HIV-1 is fundamentally different from that
in RSV. The HIV-1 p17gag coding sequence contains a strong
inhibitory element while the RSV p19gag coding sequence
does not. Interestingly, plasmid p19 contains the 5'
splice site used to generate the RSV env mRNA, which is
located downstream of the gag AUG. This 5' splice site is
not utilized in the described expression vectors (Fig.
3B). Mutation of the invariable GT dinucleotide of this
5' splice site to AT did not affect p19gag expression
significantly (data not shown). On the other hand, the
HIV-1 p17 expression plasmid did not contain any known
splice sites, yet was not expressed in the absence of Rev.
These results further indicate that sequences other than
inefficiently used splice sites are responsible for
inhibition of gag expression.

2. Mutated nil** vectg ya 
To investigate the exact nature of the 
inhibitory element in HIV-1 gag, site-directed mutagenesis 
of the p17gag coding sequence with four different 
oligonucleotides, as indicated in Fig. 4, was performed. 
Each oligonucleotide introduced several point mutations 
over an area of 19-22 nucleotides. These mutations did 
not affect the amino acid sequence of the p17gag protein, 
since they introduced silent codon changes. First, all 
four oligonucleotides were used simultaneously in 
mutagenesis using a single-stranded DNA template as 
described (T. Kunkel, Proc. Natl. Acad. Sci. USA 
82:488-492 (1985); S. Schwartz et al., Mol. Cell. Biol. 
12:207-219 (1992)). This allowed the simultaneous 
introduction of many point mutations over a large region 
of 270 nt in vector p17. A mutant containing all four 
oligonucleotides was isolated and named p17M1234. 
Compared to p17, this plasmid contained a total of 28 
point mutations distributed primarily in regions with 
high AU content. The phenotype of the mutant was assessed 
by transfection into HLtat cells and subsequent analysis 
of p17gag expression by immunoblotting. Interestingly, 
p17M1234 produced high levels of p17gag protein, higher 
than those produced by p17R in the presence of Rev 
(Fig. 2A). This result demonstrated that the 
inhibitory/instability signals in p17gag mRNA had been 
inactivated in plasmid p17M1234. As expected, the 
presence of Rev protein did not increase expression from 
p17M1234, but instead had a slight inhibitory effect on 
gag expression. Thus, p17gag expression from the mutant 
p17M1234 displayed the same general properties as the 
p19gag of RSV, that is, a high constitutive level of 
Rev-independent gag expression. Northern blot analysis 
revealed that the mRNA levels produced by p17M1234 were 
increased compared to those produced by p17 (Fig. 3A). 
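
The two properties on which this mutagenesis relies, an unchanged amino acid sequence and a reduced AU content, can be checked computationally. The following Python sketch is illustrative only: the two short sequences are hypothetical stand-ins and are not the oligonucleotides of Fig. 4, and the function names are assumptions introduced for this example.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a DNA or RNA coding sequence, ignoring a trailing partial codon."""
    dna = seq.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def au_fraction(seq):
    """Fraction of A and U (T in DNA) residues in the sequence."""
    dna = seq.upper().replace("U", "T")
    return sum(base in "AT" for base in dna) / len(dna)


def is_silent(wild_type, mutant):
    """True if both sequences have equal length and encode the same protein."""
    return len(wild_type) == len(mutant) and translate(wild_type) == translate(mutant)


# Hypothetical example sequences (NOT taken from Fig. 4).
WILD_TYPE = "ATGGGTGCGAGAGCGTCAGTATTA"
MUTANT = "ATGGGCGCCAGGGCCTCCGTGCTG"

if __name__ == "__main__":
    print("silent:", is_silent(WILD_TYPE, MUTANT))
    print("AU fraction, wild type:", au_fraction(WILD_TYPE))
    print("AU fraction, mutant:", au_fraction(MUTANT))
```

A candidate mutagenic oligonucleotide would pass this check when `is_silent` is true and the AU fraction of the mutant is lower than that of the wild-type sequence.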

To further examine the nature and exact location 
of the minimal inhibitory/instability element, the p17gag 
coding sequence in plasmid p17 was mutated with only one 
of the four mutated oligonucleotides at a time. This 
procedure resulted in four mutant plasmids, named p17M1, 
p17M2, p17M3, and p17M4, according to the oligonucleotide 
that each contains. None of these mutants produced 
significantly higher levels of p17gag protein compared to 
plasmid p17 (Fig. 5), indicating that the 
inhibitory/instability element was not affected. The p17 
coding sequence was next mutated with two oligonucleotides 
at a time. The resulting mutants were named p17M12, 
p17M13, p17M14, p17M23, p17M24, and p17M34. Protein 
production from these mutants was minimally increased 
compared with that from p17, and it was considerably lower 
than that from p17M1234 (Fig. 5). In addition, a triple 
oligonucleotide mutant, p17M123, also failed to express 
high levels of p17gag (data not shown). These findings may 
suggest that multiple inhibitory/instability signals are 
present in the coding sequence of p17gag. Alternatively, a 
single inhibitory/instability element may span a large 
region, whose inactivation requires mutagenesis with more 
than two oligonucleotides. This possibility is consistent 
with previous data suggesting that a 218-nucleotide 
inhibitory/instability element in the p17gag coding 
sequence is required for strong inhibition of gag 
expression. Further deletions of this sequence resulted 
in gradual loss of inhibition (S. Schwartz et al., J. 
Virol. 66:150-159 (1992)). The inhibitory/instability 
element may coincide with a specific secondary structure 
on the mRNA. It is currently being investigated whether a 
specific structure is important for the function of the 
inhibitory/instability element. 
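
The single, double, triple and quadruple mutants described above follow a simple combinatorial naming scheme. As a minimal sketch (the helper name `mutant_names` is an assumption made for illustration), the possible mutants for four oligonucleotides can be enumerated in Python:

```python
from itertools import combinations

# The four mutagenic oligonucleotides are referred to by the digits 1 to 4.
OLIGOS = "1234"


def mutant_names(n_oligos, base="p17M"):
    """Names of all mutants carrying exactly n_oligos of the four oligonucleotides."""
    return [base + "".join(combo) for combo in combinations(OLIGOS, n_oligos)]


if __name__ == "__main__":
    for n in range(1, len(OLIGOS) + 1):
        print(n, mutant_names(n))
```

Of the four possible triple mutants this enumeration produces, the text reports results only for p17M123.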

The p17gag coding sequence has a high content of 
A and U nucleotides, unlike the coding sequence of p19gag 
of RSV (S. Schwartz et al., J. Virol. 66:150-159 (1992); 
G. Myers and G. Pavlakis, in The Retroviridae, J. Levy, 
Eds. (Plenum Press, New York, NY, 1992), pp. 1-37). Four 
regions with high AU content are present in the p17gag 
coding sequence and have been implicated in the inhibition 
of gag expression (S. Schwartz et al., J. Virol. 
66:150-159 (1992)). Lentiviruses have a high AU content 
compared to the mammalian genome. Regions of high AU 
content are found in the gag/pol and env regions, while 
the multiply spliced mRNAs have a lower AU content (G. 
Myers and G. Pavlakis, in The Retroviridae, J. Levy, Eds. 
(Plenum Press, New York, NY, 1992), pp. 1-37), supporting 
the possibility that the inhibitory/instability elements 
are associated with mRNA regions with high AU content. It 
has been shown that a specific oligonucleotide sequence, 
AUUUA, found at the AU-rich 3' untranslated regions of 
some unstable mRNAs, may confer RNA instability (G. Shaw 
and R. Kamen, Cell 46:659-667 (1986)). Although this 
sequence is not present in the p17gag sequence, it is 
found in many copies within gag/pol and env regions. The 
association of instability elements with AU-rich regions 
is not universal, since the RRE together with 3' HIV 
sequences, which shows a strong inhibitory/instability 
activity in our vectors, is not AU-rich. These 
observations suggest the presence of more than one type of 
inhibitory/instability sequence. In addition to reducing 
the AU content, some of the mutations introduced in 
plasmid p17 changed rarely used codons to more favored 
codons for human cells. Although the use of rare codons 
could be an alternative explanation for poor HIV gag 
expression, this type of translational regulation is not 
favored by these results, since the presence of Rev 
corrects the defect in gag expression. In addition, the 
observation that the presence of non-translated sequences 
reduced gag expression (for example, the RRE sequence in 
p17R) suggests that translation of the 
inhibitory/instability region is not necessary for 
inhibition. Introduction of RRE and 3' HIV sequences in 
p17M1234 was also able to decrease gag expression, 
verifying that independent negative elements not acting 
co-translationally are responsible for poor expression. 
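
Both sequence features discussed above, the AUUUA pentamer and local AU-richness, are easy to scan for. The following Python sketch is a hedged illustration: the window size and threshold are arbitrary assumptions, not values taken from this specification, and the function names are hypothetical.

```python
import re


def count_auuua(seq):
    """Count AUUUA pentamers, including overlapping ones, in an RNA or DNA sequence."""
    rna = seq.upper().replace("T", "U")
    # A zero-width lookahead lets overlapping motifs be counted separately.
    return len(re.findall(r"(?=AUUUA)", rna))


def au_rich_windows(seq, window=50, threshold=0.6):
    """Return (1-based start, AU fraction) for each window at or above the threshold."""
    rna = seq.upper().replace("T", "U")
    hits = []
    for start in range(len(rna) - window + 1):
        segment = rna[start:start + window]
        fraction = sum(base in "AU" for base in segment) / window
        if fraction >= threshold:
            hits.append((start + 1, fraction))
    return hits


if __name__ == "__main__":
    example = "GGCAUUUAUUUACCGG"  # hypothetical sequence for demonstration only
    print(count_auuua(example))
    print(au_rich_windows(example, window=8, threshold=0.75))
```

Such a scan can only flag candidate regions; as noted above, not every inhibitory/instability element is AU-rich, so functional testing remains necessary.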

3. Identification and elimination of 
additional INS sequences in the p24 and p15 
regions of the gag gene 

To examine the effect of removal of INS in the 
p17gag coding region (the p17gag coding region spans 
nucleotides 336-731, as described in the description of 
Fig. 1(B) above, and contains the first of three parts 
(i.e., p17, p24, and p15) of the gag coding region, as 
indicated in Fig. 1(A) and (B)) on the expression of 
the complete gag gene, expression vectors were constructed 
in which additional sequences of the gag gene were 
inserted 3' to the mutationally altered p17gag coding 
region, downstream of the stop codon, of vector p17M1234. 
Three vectors containing increasing lengths of gag 
sequences were studied: p17M1234(731-1081), 
p17M1234(731-1424) and p17M1234(731-2165), as shown in 
Fig. 1(C). Levels of expression of p17gag were measured, 
with the results indicating that the region of the mRNA 
encoding the second part of the gag protein (i.e., the 
part encoding the p24gag protein, which spans nucleotides 
731-1424) contains only a weak INS, as determined by a 
small reduction in the amount of p17gag protein expressed 
by p17M1234(731-1424) as compared with the amount of 
p17gag protein expressed by p17M1234, while the region of 
the mRNA encoding the third part of the gag protein (i.e., 
the part encoding the p15gag protein, which spans 
nucleotides 1425-2165) contains a strong INS, as 
determined by a large reduction in the amount of gag 
protein expressed by p17M1234(731-2165) as compared with 
the amount of protein expressed by p17M1234 and 
p17M1234(731-1424). 



4. p37M1234 vector 

The above analysis allowed the construction of 
vector p37M1234, which expressed high levels of p37gag 
precursor protein (which contains both the p17gag and 
p24gag protein regions). Vector p37M1234 was constructed 
by removing the stop codon at the end of the gene encoding 
the altered p17gag protein and fusing the nucleotide 
sequence encoding the p24gag protein into the correct 
reading frame by oligonucleotide mutagenesis. This 
restored the nucleotide sequence so that it encoded the 
fused p17gag and p24gag protein (i.e., the p37gag protein) 
as it is encoded by HIV-1. Since the presence of the 
p37gag or of the p24gag protein can be quantitated easily 
by commercially available ELISA kits, vector p37M1234 can 
be used for inserting and testing additional fragments 
suspected of containing INS. Examples of such uses are 
shown below. 
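
When a reporter vector of this kind is used to test a fragment, the readout reduces to a ratio of reporter protein levels. The Python sketch below is a hypothetical illustration of that comparison; the cutoff values and category labels are assumptions chosen for the example and are not defined anywhere in this specification.

```python
def fold_inhibition(reporter_alone, reporter_with_fragment):
    """Ratio of reporter level without the test fragment to the level with it."""
    if reporter_with_fragment <= 0:
        raise ValueError("reporter level with fragment must be positive")
    return reporter_alone / reporter_with_fragment


def classify_fragment(fold, weak_cutoff=2.0, strong_cutoff=10.0):
    """Label a fragment by fold inhibition; both cutoffs are hypothetical."""
    if fold >= strong_cutoff:
        return "strong INS"
    if fold >= weak_cutoff:
        return "weak INS"
    return "no detectable INS"


if __name__ == "__main__":
    # Hypothetical ELISA readings in arbitrary units.
    fold = fold_inhibition(reporter_alone=100.0, reporter_with_fragment=5.0)
    print(fold, classify_fragment(fold))
```

In practice the readings would first be normalized for transfection efficiency before the ratio is taken.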



5. Vectors p17M1234(731-1081)NS and p55BM1234 

Other vectors which were constructed in a 
manner similar to p37M1234 were p17M1234(731-1081)NS 
and p55BM1234 (Fig. 1(C)). The levels of gag expression 
from each of these three vectors, which allow the 
translation of the region downstream (3') of the p17 
coding region, were respectively similar to the levels of 
gag expression from the vectors containing the nucleotide 
sequences 3' to a stop codon (i.e., vectors 
p17M1234(731-1081), p17M1234(731-1424) and 
p17M1234(731-2165), described above). These results also 
demonstrate that the INS regions in the gag gene are not 
affected by translation or lack thereof through the INS 
region. These results demonstrate the use of p17M1234 to 
detect additional INS sequences in the HIV-1 gag coding 
region (i.e., in the 1424-2165 encoding region of HIV-1 
gag). Thus, these results also demonstrate how a gene 
containing one or more inhibitory/instability regions can 
be mutated to eliminate one inhibitory/instability region 
and then used to further locate additional 
inhibitory/instability regions within that gene, if any. 

6. Vectors p37M1-10D and p55M1-13P0 
As described above, experiments indicated the 
presence of INS in the p24 and p15 regions of HIV-1 in 
addition to those identified and eliminated in the p17gag 
region of HIV-1. This is depicted schematically in Figure 
6 on page 7180 of Schwartz et al., J. Virol. 66:7176-7182 
(1992). In that figure, cgagM1234 is identical to 
p55BM1234. 

By studying the expression of p24gag protein in 
vectors encoding the p24gag protein containing additional 
gag and pol sequences, it was found that vectors that 
contained the complete gag gene and part of the pol gene 
(e.g., vector p55BM1234, see Fig. 6) were not expressed at 
high levels, despite the elimination of INS-1 in the 
p17gag region as described above. The inventors have 
hypothesized that this is caused by the presence of 
multiple INS regions able to act independently of each 
other. To eliminate the additional INS, several mutant 
HIV-1 oligonucleotides were constructed (see Table 2) and 
incorporated in various gag expression vectors. For 
example, oligonucleotides M6gag, M7gag, M8gag and M10gag 
were introduced into p37M1234, resulting in p37M1-10D, and 
the same oligonucleotides were introduced into p55BM1234, 
resulting in p55BM1-10. These experiments revealed a 
dramatic improvement of expression of p37gag (which is the 
p17gag and p24gag precursor) and p55gag (which is the 
intact gag precursor molecule produced by HIV-1) upon the 
incorporation in the expression vectors p37M1234 and 
p55BM1234 of additional mutations contained in the 
oligonucleotides M6gag, M7gag, M8gag and M10gag (described 
in Table 2). Fig. 6 shows that expression was 
dramatically improved after the introduction of additional 
mutations. 

Of particular interest was p37M1-10D, which 
produced very high levels of gag. This has been the 
highest producing gag construct (see Fig. 6). 
Interestingly, addition of gag and pol sequences as in 
vectors p55BM1-10 and p55AM1-10 (Fig. 6) reduced the 
levels of gag expression. Upon further mutagenesis, the 
inhibitory effects of this region were partially 
eliminated, as shown in Fig. 6 for vector p55M1-13P0. 
Introduction of mutations defined by the gag region 
oligonucleotides M10gag, M11gag, M12gag, M13gag, and pol 
region oligonucleotide M0pol increased the levels of gag 
expression approximately six fold over vectors such as 
p55BM1-10. 

The HIV-1 promoter was replaced by the human 
cytomegalovirus early promoter (CMV) in plasmids p37M1-10D 
and p55M1-13P0 to generate plasmids pCMV37M1-10D and 
pCMV55M1-13P0, respectively. For this, a fragment 
containing the CMV promoter was amplified by PCR 
(nucleotides -670 to +73, where +1 is the start of 
transcription, see Boshart et al., Cell 41:521 
(1985)). This fragment was exchanged with the StuI-BssHII 
fragment in gag vectors p37M1-10D and p55M1-13P0, 
resulting in the replacement of the HIV-1 promoter with 
that of CMV. The resulting plasmids were compared to 
those containing the HIV-1 promoter after transfection in 
human cells, and gave similar high expression of gag. 
Therefore, the high expression of gag can be achieved in 
the total absence of any other viral protein. The 
exchange of the HIV-1 promoter with other promoters is 
beneficial if constitutive expression is desirable and 
also for expression in other mammalian cells, such as 
mouse cells, in which the HIV-1 promoter is weak. 
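
The size of the amplified CMV promoter fragment follows from its coordinates. As a small worked example in Python, assuming the usual convention that transcription-relative numbering has no position 0 (so that -1 is immediately followed by +1), and with a hypothetical function name:

```python
def promoter_fragment_length(upstream, downstream):
    """Length of a fragment given transcription-relative coordinates.

    Assumes there is no position 0: the numbering runs ..., -2, -1, +1, +2, ...
    """
    if upstream >= 0 or downstream <= 0:
        raise ValueError("expected a negative upstream and a positive downstream coordinate")
    return -upstream + downstream


if __name__ == "__main__":
    # Coordinates of the amplified fragment as stated in the text.
    print(promoter_fragment_length(-670, 73))
```

Under that assumption the -670 to +73 fragment comprises 670 upstream and 73 transcribed nucleotides.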

The constructed vectors p37M1-10D and p55BM1-10 
can be used for the Rev-independent production of p37gag 
and p55gag proteins, respectively. In addition, these 
vectors can be used as convenient reporters to identify 
and eliminate additional INS in different RNA molecules. 

Using the protocols described herein, regions 
have been identified within the gp41 (the transmembrane 
part of HIV-1 env) coding area and at the post-env 3' 
region of HIV-1 which contain INS. The elimination of INS 
from gag, pol and env regions will allow the expression of 
high levels of authentic HIV-1 structural proteins in the 
absence of the Rev regulatory factor of HIV-1. The 
mutated coding sequences can be incorporated into 
appropriate gene transfer vectors which may allow the 
targeting of specific cells and/or more efficient gene 
transfer. Alternatively, the mutated coding sequences can 
be used for direct expression in human or other cells in 
vitro or in vivo, with the goal being the production of 
high protein levels and the generation of a strong immune 
response. The ultimate goal in either case is subsequent 
protection from HIV infection and disease. 

The described experiments demonstrate that the 
inhibitory/instability sequences are required to prevent 
HIV-1 expression. This block to the expression of viral 
structural proteins can be overcome by the Rev-RRE 
interaction. In the absence of INS, HIV-1 expression 
would be similar to that of simpler retroviruses and would 
not require Rev. Thus, the INS is a necessary component of 
Rev regulation. Sequence comparisons suggest that the INS 
element identified here is conserved in all HIV-1 
isolates, although this has not been verified 
experimentally. The majority (22 of 28) of the mutated 
nucleotides in gag are conserved in all HIV-1 isolates, 
while 22 of 28 are conserved also in HIV-2 (G. Myers et 
al., Eds., Human retroviruses and AIDS. A compilation and 
analysis of nucleic acid and amino acid sequences (Los 
Alamos National Laboratory, Los Alamos, New Mexico, 1991), 
incorporated herein by reference). Several lines of 
evidence indicate that all lentiviruses and other complex 
retroviruses such as the HTLV group contain similar INS 
regulatory elements. Strong INS elements have been 
identified in the gag region of HTLV-I and SIV (manuscript 
in preparation). This suggests that INS are important 
regulatory elements, and may be responsible for some of 
the biological characteristics of the complex 
retroviruses. The presence of INS in SIV and HTLV-I 
suggests that these elements are conserved among complex 
retroviruses. Since INS inhibit expression, it must be 
concluded that their presence is advantageous to the 
virus; otherwise they would be rapidly eliminated by 
mutations. 

The observations that the inhibitory/instability 
sequences act in the absence of any other viral proteins 
and that they can be inactivated by mutagenesis suggest 
that these elements may be targets for the binding of 
cellular factors that interact with the mRNA and inhibit 
post-transcriptional steps of gene expression. The 
interaction of HIV-1 mRNAs with such factors may cause 
nuclear retention, resulting in either further splicing or 
rapid degradation of the mRNAs. It has been proposed that 
components of the splicing machinery interact with splice 
sites in HIV-1 mRNAs and modulate mRNA expression (A. 
Cochrane et al., J. Virol. 65:5305-5313 (1991); D. Chang 
and P. Sharp, Cell 59:789-795 (1989); X. Lu et al., Proc. 
Natl. Acad. Sci. USA 87:7598-7602 (1990)). However, it is 



WO 93/20212 



PCT/US93/02908 



not likely that the inhibitory/instability elements 
described here are functional 5' or 3' splice sites. 
Thorough mapping of HIV-1 splice sites performed by 
several laboratories using the reverse transcriptase-PCR 
technique failed to detect any splice sites within gag (S. 
Schwartz et al., J. Virol. 64:2519-2529 (1990); J. 
Guatelli et al., J. Virol. 64:4093-4098 (1990); E. D. 
Garrett et al., J. Virol. 65:1653-1657 (1991); M. 
Robert-Guroff et al., J. Virol. 64:3391-3398 (1990); S. 
Schwartz et al., J. Virol. 64:5448-5456 (1990); S. 
Schwartz et al., Virology 181:677-686 (1991)). The 
suggestions that Rev may act by dissociating unspliced 
mRNA from the spliceosomes (D. Chang and P. Sharp, Cell 
59:789-795 (1989)) or by inhibiting splicing (J. Kjems et 
al., Cell 67:169-178 (1991)) are not easily reconciled 
with the knowledge that all retroviruses produce 
structural proteins from mRNAs that contain unutilized 
splice sites. Splicing of all retroviral mRNAs, including 
HIV-1 mRNAs in the absence of Rev, is inefficient compared 
to splicing of cellular mRNAs (J. Kjems et al., Cell 
67:169-178 (1991); A. Krainer et al., Genes Dev. 
4:1158-1171 (1990); R. Katz and A. Skalka, Mol. Cell. 
Biol. 10:696-704 (1990); C. Stoltzfus and S. Fogarty, J. 
Virol. 63:1669-1676 (1989)). The majority of the 
retroviruses do not produce Rev-like proteins, yet they 
efficiently express proteins from partially spliced mRNAs, 
suggesting that inhibition of expression by unutilized 
splice sites is not a general property of retroviruses. 
Experiments using constructs expressing mutated HIV-1 gag 
and env mRNAs lacking functional splice sites showed that 
only low levels of these mRNAs accumulated in the absence 
of Rev and that their expression was Rev-dependent (M. 
Emerman et al., Cell 57:1155-1165 (1989); B. Felber et 
al., Proc. Natl. Acad. Sci. USA 86:1495-1499 (1989); M. 
Malim et al., Nature (London) 338:254-257 (1989)). This 
led to the conclusion that Rev acts independently of 
splicing (B. Felber et al., Proc. Natl. Acad. Sci. USA 
86:1495-1499 (1989); M. Malim et al., Nature (London) 
338:254-257 (1989)) and to the proposal that 
inhibitory/instability elements other than splice sites 
are present on HIV-1 mRNAs (C. Rosen et al., Proc. Natl. 
Acad. Sci. USA 85:2071-2075 (1988); M. 
Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274 
(1989); B. Felber et al., Proc. Natl. Acad. Sci. USA 
86:1495-1499 (1989)). 

Construction of the Gag Expression Plasmids 
Plasmid p17R has been described as pNL17R (S. 
Schwartz et al., J. Virol. 66:150-159 (1992)). Plasmid 
p17 was generated from p17R by digestion with restriction 
enzyme Asp718 followed by religation. This procedure 
deleted the RRE and HIV-1 sequences spanning nt 8021-8561 
upstream of the 3' LTR. To generate mutants of p17gag, the 
p17gag coding sequence was subcloned into a modified 
pBLUESCRIPT vector (Stratagene) and single-stranded 
uracil-containing DNA was generated. Site-directed 
mutagenesis was performed as described (T. Kunkel, Proc. 
Natl. Acad. Sci. USA 82:488-492 (1985); S. Schwartz et 
al., Mol. Cell. Biol. 12:207-219 (1992)). Clones 
containing the appropriate mutations were selected by 
sequencing of double-stranded DNA. To generate plasmid 
p19R, plasmid p17R was first digested with BssHII and 
EcoRI, thereby deleting the entire p17gag coding sequence, 
six nucleotides upstream of the p17gag AUG and nine 
nucleotides of linker sequences 3' of the p17gag stop 
codon. The p17gag coding sequence in p17R was replaced by 
a PCR-amplified DNA fragment containing the RSV p19gag 
coding sequence (R. Weiss et al., RNA Tumor Viruses. 
Molecular Biology of Tumor Viruses (Cold Spring Harbor 
Laboratory, Cold Spring Harbor, New York, 1985)). This 
fragment contained eight nucleotides upstream of the RSV 
gag AUG and the p19gag coding sequence immediately 
followed by a translational stop codon. The RSV gag 
fragment was derived from the 



infectious RSV proviral clone S-RA (R. Weiss et al., RNA 
Tumor Viruses. Molecular Biology of Tumor Viruses (Cold 
Spring Harbor Laboratory, Cold Spring Harbor, New York, 
1985)). p19 was derived from p19R by excising an Asp718 
fragment containing the RRE and 3' HIV-1 sequences 
spanning nt 8021-8561. 

Transfection of HLtat Cells With Gag Expression Plasmids 

HLtat cells (S. Schwartz et al., J. Virol. 
64:2519-2529 (1990)) were transfected using the calcium 
coprecipitation technique (F. Graham and A. Van der 
Eb, Virology 52:456-460 (1973)) as described (B. Felber et 
al., Proc. Natl. Acad. Sci. USA 86:1495-1499 (1989)) 
using 5 µg of p17, p17R, p17M1234, p19, or p19R in the 
absence (-) or presence (+) of 2 µg of the Rev-expressing 
plasmid pL3crev (B. Felber et al., Proc. Natl. Acad. Sci. 
USA 86:1495-1499 (1989)). The total amount of DNA in 
transfections was adjusted to 17 µg per 0.5 ml of 
precipitate per 60 mm plate using pUC19 carrier DNA. 
Cells were harvested 20 h after transfection and cell 
extracts were subjected to electrophoresis on 12.5% 
denaturing polyacrylamide gels and analyzed by 
immunoblotting using either human HIV-1 patient serum 
(Scripps) or a rabbit anti-p19gag serum. pRSV-luciferase 
(J. de Wet et al., Mol. Cell. Biol. 7:725-737 (1987)), 
which contains the firefly luciferase gene linked to the 
RSV LTR promoter, was used as an internal standard to 
control for transfection efficiency and was quantitated as 
described (L. Solomin et al., J. Virol. 64:6010-6017 
(1990)). The results are set forth in Fig. 2. 
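
The amount of carrier DNA needed per plate follows from the stated quantities. The Python sketch below is a simple worked calculation; it deliberately ignores the pRSV-luciferase internal standard, whose amount is not stated in the text, and the function name is an assumption.

```python
TOTAL_DNA_UG = 17.0  # total DNA per 0.5 ml precipitate per 60 mm plate, from the text


def carrier_dna_ug(test_plasmid_ug, rev_plasmid_ug=0.0, total_ug=TOTAL_DNA_UG):
    """Micrograms of carrier DNA needed to bring a transfection mix up to total_ug.

    Any internal-standard plasmid is not accounted for (its amount is not given).
    """
    carrier = total_ug - test_plasmid_ug - rev_plasmid_ug
    if carrier < 0:
        raise ValueError("plasmid amounts exceed the total DNA per plate")
    return carrier


if __name__ == "__main__":
    print("without Rev plasmid:", carrier_dna_ug(5.0))
    print("with Rev plasmid:", carrier_dna_ug(5.0, 2.0))
```

Keeping the total DNA constant in this way ensures that precipitate formation is comparable between the plates with and without the Rev-expressing plasmid.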

Northern Blot Analysis 
HLtat cells were transfected as described above 
and harvested 20 h post transfection. Total RNA was 
prepared by the heparin/DNase method (Z. Krawczyk and C. 
Wu, Anal. Biochem. 165:20-27 (1987)), and 20 µg of total 
RNA was subjected to northern blot analysis as described 
(M. Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274 
(1989)). The filters were hybridized to a nick-translated 
PCR-amplified DNA fragment spanning nt 8304-9008 in the 
HIV-1 3' LTR. The results are set forth in Fig. 3. 

EXAMPLE 2 
HIV-1 ENV GENE 
Fragments of the env gene were inserted into 
vectors p19 or p37M1234 and the expression of the 
resulting plasmids was analyzed by transfection into 
HLtat cells. It was found that several fragments 
inhibited protein expression. One of the strong INS 
identified was in the fragment containing nucleotides 
8206-8561 ("fragment [8206-8561]"). To eliminate this 
INS, the following oligonucleotides were synthesized and 
used in mutagenesis experiments as specified supra. The 
fragment was derived from the molecular clone pNL43, which 
is almost identical to HXB2. The numbering system used 
herein follows the numbering of molecular clone HXB2 
throughout. The synthesized oligonucleotides follow the 
pNL43 sequence. 

The oligonucleotides which were used to 
mutagenize fragment [8206-8561], and which made changes in 
the env coding region between nucleotides 8210-8555 (the 
letters in lower case indicate mutated nucleotides) were: 

#1: 8194-8261 
GAATAGTGCTGTTAACcTcCTgAAcGCtACcGCtATcGCcGTgGCgGAaGGaACcGAcAGGGTTATAG 
(SEQ ID NO: 10) 

#2: 8262-8323 
AAGTATTACAAGCcGCcTAccGcGCcATcaGaCAtATcCCccGccGcATccGcCAGGGCTTG 
(SEQ ID NO: 11) 

#3: 8335-8392 
GCTATAAGATGGGcGGtAAaTGGagcAAgtcctccGTcATcGGcTGGCCTGCTGTAAG 
(SEQ ID NO: 12) 

#4: 8393-8450 
GGAAAGAATGcGcaGgGCcGAaCCcGCcGCcGAcGGaGTtGGcGCcGTATCTCGAGAC 
(SEQ ID NO: 13) 

#5: 8451-8512 
CTAGAAAAACAcGGcGCcATtACctcctctAAcACcGCcGCcAAtAAcGCcGCTTGTGCCTG 
(SEQ ID NO: 14) 

#6: 8513-8572 
GCTAGAAGCACAgGAaGAaGAgGAaGTcGGcTTcCCcGTtACcCCTCAGGTACCTTTAAG 
(SEQ ID NO: 15) 


The expression of env was increased by the 
elimination of the INS in fragment [8206-8561] as 
determined by analysis of both mRNA and protein. 
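
The oligonucleotide listing above can be checked for internal consistency: each sequence should be as long as its coordinate span, and the lower-case (mutated) positions should fall within nucleotides 8210-8555. The Python sketch below performs that check. The sequences are transcribed from the listing with extraction spacing removed; the label "#6" for the last oligonucleotide and the helper names are assumptions made for this example.

```python
# name: (first nt, last nt, sequence); lower case marks mutated nucleotides.
OLIGOS = {
    "#1": (8194, 8261, "GAATAGTGCTGTTAACcTcCTgAAcGCtACcGCtATcGCcGTgGCgGAaGGaACcGAcAGGGTTATAG"),
    "#2": (8262, 8323, "AAGTATTACAAGCcGCcTAccGcGCcATcaGaCAtATcCCccGccGcATccGcCAGGGCTTG"),
    "#3": (8335, 8392, "GCTATAAGATGGGcGGtAAaTGGagcAAgtcctccGTcATcGGcTGGCCTGCTGTAAG"),
    "#4": (8393, 8450, "GGAAAGAATGcGcaGgGCcGAaCCcGCcGCcGAcGGaGTtGGcGCcGTATCTCGAGAC"),
    "#5": (8451, 8512, "CTAGAAAAACAcGGcGCcATtACctcctctAAcACcGCcGCcAAtAAcGCcGCTTGTGCCTG"),
    "#6": (8513, 8572, "GCTAGAAGCACAgGAaGAaGAgGAaGTcGGcTTcCCcGTtACcCCTCAGGTACCTTTAAG"),
}


def span_length(first_nt, last_nt):
    """Number of nucleotides in an inclusive coordinate range."""
    return last_nt - first_nt + 1


def mutated_positions(first_nt, seq):
    """Genome coordinates of the lower-case (mutated) nucleotides."""
    return [first_nt + i for i, base in enumerate(seq) if base.islower()]


def all_mutated_positions(oligos=OLIGOS):
    """Sorted coordinates of every mutated nucleotide across all oligonucleotides."""
    positions = []
    for first_nt, _last_nt, seq in oligos.values():
        positions.extend(mutated_positions(first_nt, seq))
    return sorted(positions)


if __name__ == "__main__":
    for name, (first_nt, last_nt, seq) in OLIGOS.items():
        ok = len(seq) == span_length(first_nt, last_nt)
        print(name, "length matches span:", ok,
              "| mutated nucleotides:", len(mutated_positions(first_nt, seq)))
    positions = all_mutated_positions()
    print("mutations span nt", positions[0], "to", positions[-1])
```

With the sequences as transcribed here, the first and last mutated positions agree with the 8210-8555 range stated in the text.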

To further characterize in detail the INS in 
HIV-1 env, the coding region of env was divided into 
different fragments, which were produced by PCR using 
appropriate synthetic oligonucleotides, and cloned in 
vector p37M1-10D. This vector was produced from p37M1234 
by additional mutagenesis as described above. After 
introduction into human cells, vector p37M1-10D produces 
high levels of p37gag protein. Any strong INS element will 
inhibit the expression of gag if ligated in the same 
vector. The summary of the env fragments used is shown in 
Figure 11. The results of these experiments show that, 
as in HIV-1 gag, there exist multiple regions inhibiting 
expression in HIV-1 env, and combinations of such regions 
result in additive or synergistic inhibition. For 
example, while fragments 1, 2, or 3 individually inhibit 
expression by 2-6 fold, the combination of these fragments 
inhibits expression by 30 fold. Based on these results, 
additional mutant oligonucleotides have been synthesized 
for the correction of env INS. These oligonucleotides 
have been introduced in the expression vectors for HIV-1 
env p120pA and p120R270 (see Fig. 7) for the development 
of Rev-independent HIV-1 env expression plasmids as 
discussed in detail below. 

1. The mRNAs for gp160 and for the 
extracellular domain (gp120) are defective 
and their expression depends on the 
presence of RRE in cis and Rev in trans. 

1.1 Positive and Negative Determinants for 
env mRNA Expression of HIV 


Previous experiments on the identification and 
characterization of the env-expressing cDNAs had 
demonstrated that Env is produced from mRNAs that contain 
exon 4AE, 4BE, or 5E (Schwartz et al., J. Virol. 
64:5448-5456 (1990); Schwartz et al., Mol. Cell. Biol. 
12:207-219 (1992)). All constructs generated to study the 
determinants of env expression are derived from pNL15E. 
This plasmid contains the HIV-1 LTR promoter, the complete 
env cDNA 15E, and the HIV 3' LTR including the 
polyadenylation signal (Schwartz et al., J. Virol. 
64:5448-5456 (1990)) (Fig. 7). pNL15E was generated from 
the molecular clone pNL4-3 (pNL4-3 is identical to pNL43 
herein) (Adachi et al., J. Virol. 59:284-291 (1986)) and 
lacks the splice acceptor site for exon 6D, which was used 
to generate the tev mRNA (Benko et al., J. Virol. 
64:2505-2518 (1990)). The Env expression plasmids were 
transfected in the presence or absence of the 
Rev-expressing plasmid pL3crev (Felber et al., J. Virol. 
64:3734-3741 (1990)) into HLtat cells (Schwartz et al., J. 
Virol. 64:2519-2529 (1990)), which constitutively express 
Tat (one-exon Tat). One day later, the cells were 
harvested for analyses of RNA and protein. Total RNA was 
extracted and analyzed on Northern blots. Protein 
production was measured by Western blots to detect 
cell-associated Env. In the absence of Rev, NL15E mRNA was 
efficiently spliced and produced Nef; in the presence of 
Rev, most of the RNA remained unspliced and produced the 
Env precursor gp160, which is processed to gp120, the 
secreted portion of the precursor, and gp41. 

To allow for the effects of INS to be 
distinguished and studied separately from splicing, splice 
sites known to exist within some of the fragments used 
were eliminated as discussed below. Analysis of the 
resulting expression vectors included size determination 
of the produced mRNA, providing the verification that 
splicing does not interfere with the interpretation of the 
data. 

1.2 Env expression is Rev-dependent also 
in the absence of functional splice 
sites 



To study the effect of splicing on env 
expression, the splice donor at nt 5592 was removed by 
site-directed mutagenesis (changing GCAGTA to GaAtTc, and 
thus introducing an EcoRI site), which resulted in plasmid 
15ESD- (Fig. 7). The mRNA from this construct was 
efficiently spliced and produced a small mRNA encoding Nef 
(Fig. 8). Sequence analysis revealed that this spliced 
mRNA was generated by the use of an alternative splice 
donor located at nt 5605 (TACATgtaatg) and the common 
splice acceptor site at nt 7925. In contrast to published 
work (Lu et al., Proc. Natl. Acad. Sci. USA 87:7598-7602 
(1990)), expression of Env from this mutant depended on 
Rev. Next, the splice acceptor site was mutated at nt 
7925. Since previous cDNA cloning had revealed that in 
addition to the splice acceptor site at nt 7925 there are 
two additional splice acceptor sites at nt 7897 and nt 
7901 (Schwartz et al., J. Virol. 64:2519-2529 (1990)), 
this region of 43 bp encompassing nt 7884 to nt 7926 was 
removed. This resulted in p15EDSS (Fig. 7). Northern 
blot analysis of mRNA from HLtat cells transfected with 
this construct confirmed that the 15EDSS mRNA is not 
spliced (Fig. 8B). Although all functional splice sites 
have been removed from p15EDSS, Rev is still required for 
Env production (Fig. 8A). Taken together with data 
obtained by studying gag expression, these results suggest 
that the presence of inefficiently used splice sites is 
not the primary determinant for Rev-dependent Env 
expression. It is known that at least two unused splice 
sites are present in this mRNA (the alternative splice 
donor at nt 5605 and the splice donor of exon 6D at nt 
6269). Therefore, it cannot be ruled out that initial 
spliceosome formation can occur, which does not lead to 
the execution of splicing. It is possible that this is 
sufficient to retain the mRNA in the nucleus and, since no 
splicing occurs, that this would lead to degradation of 
the mRNA. Alternatively, it is possible that 
splice-site-independent RNA elements similar to those 
identified within the gag/pol region (INS) are responsible 
for the Rev dependency (Schwartz et al., J. Virol. 
66:7176-7182 (1992); Schwartz et al., J. Virol. 
66:150-159 (1992)). 
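
Two details of the splice-site mutagenesis can be verified with a few lines of code: that the GCAGTA to GaAtTc change creates an EcoRI recognition site (GAATTC), and that the deleted region nt 7884-7926 is 43 bp long. The following Python sketch is illustrative; the helper names are assumptions.

```python
ECORI_SITE = "GAATTC"        # EcoRI recognition sequence
ORIGINAL_DONOR = "GCAGTA"    # sequence at the splice donor, as given in the text
MUTATED_DONOR = "GaAtTc"     # lower case marks the changed nucleotides


def changed_positions(original, mutated):
    """0-based positions at which two equal-length sequences differ (case-insensitive)."""
    if len(original) != len(mutated):
        raise ValueError("sequences must have equal length")
    pairs = zip(original.upper(), mutated.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def inclusive_length(first_nt, last_nt):
    """Number of nucleotides in an inclusive coordinate range."""
    return last_nt - first_nt + 1


if __name__ == "__main__":
    print("creates EcoRI site:", MUTATED_DONOR.upper() == ECORI_SITE)
    print("changed positions:", changed_positions(ORIGINAL_DONOR, MUTATED_DONOR))
    print("deletion length:", inclusive_length(7884, 7926))
```

The three positions reported as changed coincide with the three lower-case letters in GaAtTc, and the GT dinucleotide of the original sequence is no longer present in the mutated one.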

1.3 Identification of negative elements 
within gp120 mRNA 

To distinguish between these possibilities, a 
series of constructs was designed that allowed the 
determination of the location of such INS elements. 
First, a stop codon followed by the restriction sites for 
NruI and MluI was introduced at the cleavage site between 
the extracellular gp120 and the transmembrane protein gp41 
at nt 7301 in plasmid NL15EDSS, resulting in p120DSS (Fig. 
7). Immunoprecipitation of gp120 from the medium of cells 
transfected with p120DSS confirmed the production of high 
levels of gp120 only in the presence of Rev (Fig. 9B). 
The release of gp120 is very efficient, since only barely 
detectable amounts remain associated with the cells (data 
not shown). This finding rules out the possibility that 
the translation of the gp41 portion of the env cDNA is 
responsible for the defect in env expression. Next, the 
region 3' of the stop codon of gp120 (consisting of gp41 
including the RRE and 3' LTR) was replaced with the SV40 
polyadenylation signal (Fig. 7). This 
construct, p120pA, produced very low levels of gp120 in 
the absence of Rev (Fig. 9B). Background levels of Env 
were produced from p120DR (Fig. 7), which was generated 
from pBS120DSS by removing the 5' portion of gp41 
including the RRE (MluI to HpaI at nt 8200) (Fig. 9B). 
These results demonstrate the presence of a major INS-like 
sequence within the gp120 portion. To study the effect of 
Rev on this mRNA, different RREs (RRE330, RRE270, and 
RRED345 (Solomin et al., J. Virol. 64:6010-6017 (1990))) 
were inserted into p120pA downstream of the gp120 stop 
codon, resulting in p120R330, p120R270, and p120RD345, 
respectively (Fig. 7). Immunoprecipitations demonstrated 
that the presence of Rev in trans and the RRE in cis could 
rescue the defect in the gp120 expression plasmid. High 
levels of gp120 were produced from p120R330 (data not 
shown), p120R270, and p120RD345 (Fig. 9B) in the presence 
of Rev. 

Northern blot analysis (Fig. 8A) confirmed the
protein data. The presence of Rev resulted in the
accumulation of high levels of mRNA produced by pBS120DSS,
p120R270, and p120RD345. Low but detectable levels of RNA
were produced from p120DpA and p120DR.
2. Identification of INS elements located
within the env mRNA regions using two
strategies

To identify elements that have a downregulatory
effect in vivo, fragments of env cDNA were inserted into
two different test expression vectors, p19 and p37M1-10D.
These vectors contain a strong promoter for rapid
detection of the gene product, such as the HIV-1 LTR in
the presence of Tat, and an indicator gene that is
expressed at high levels and can easily be assayed, such as
p19gag of RSV or the mutated p37gag gene of HIV-1
(p37M1-10D), neither of which contains any known INS-like
elements. Expression vector p19 contains the HIV-1 LTR
promoter, the RSV p19gag matrix gene, and HIV-1 sequences
starting at KpnI (nt 8561) including the complete 3' LTR
(Schwartz et al., J. Virol. 66:7176-7182 (1992)). Upon
transfection into HLtat cells, high levels of p19gag are
constitutively produced and are visualized on Western
blots. Expression vector p37M1-10D contains the HIV-1 LTR
promoter, the mutant p37gag (M1-10), and the 3' portion of
the virus starting at KpnI (nt 8561). Upon transfection
into HLtat cells, this plasmid constitutively produces
p37gag that can be quantitated by the HIV-1 p24gag antigen
capture assay.

2.1 Identification of INS elements using
the RSV gag expression vector

INS elements within the gp41 and gp120 portions
were identified. To this end, the vector p19 was used and
the following fragments (Fig. 10) were inserted: (A) nt
7684 to 7959; (B) nt 7684 to 7884 and nt 7927 to 7959;
this is similar to fragment A but has the region of the
splice acceptors 7A, 7B and 7 deleted; (C) nt 7595 to 7884
and nt 7927 to 7959, having the splice sites deleted as in
B; (D) nt 7939 to 8066; (E) nt 7939 to 8416; (F) nt 8200
to 8561 (HpaI-KpnI); (G) nt 7266 to 7595 containing the
intact RRE; (H) nt 5523 to 6190, having the splice donor
SD5 deleted.
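
As an aid to reading the fragment list, the coordinates of fragments A to H can be held in a small table and used to compute each insert's size and which inserts share sequence. The following is only a sketch: the coordinates are copied from the list above, while the names `FRAGMENTS`, `fragment_length` and `overlaps`, and the treatment of both end coordinates as inclusive, are assumptions made for illustration.

```python
# Coordinates (nt) copied from the fragment list; each fragment is a list of
# (start, end) segments. Inclusive end coordinates are an assumption.
FRAGMENTS = {
    "A": [(7684, 7959)],
    "B": [(7684, 7884), (7927, 7959)],  # splice acceptor region deleted
    "C": [(7595, 7884), (7927, 7959)],  # splice sites deleted as in B
    "D": [(7939, 8066)],
    "E": [(7939, 8416)],
    "F": [(8200, 8561)],                # HpaI-KpnI
    "G": [(7266, 7595)],                # contains the intact RRE
    "H": [(5523, 6190)],                # splice donor SD5 deleted
}


def fragment_length(segments):
    """Total number of nucleotides covered by a fragment's segments."""
    return sum(end - start + 1 for start, end in segments)


def overlaps(a, b):
    """True if any segment of fragment a shares a nucleotide with fragment b."""
    return any(s1 <= e2 and s2 <= e1 for s1, e1 in a for s2, e2 in b)


if __name__ == "__main__":
    for name, segments in FRAGMENTS.items():
        shared = [other for other, segs in FRAGMENTS.items()
                  if other != name and overlaps(segments, segs)]
        print(name, fragment_length(segments), "overlaps:", ", ".join(shared) or "none")
```

Listing the overlaps makes it easier to see which inhibitory effects could stem from shared sequence (for example, fragments E and F both cover nt 8200 to 8416).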

Fragments A, B, and D did not affect Gag
expression, whereas fragment G (RRE) decreased Gag
expression approximately 5-fold. Fragments C, E, and H
lowered Gag expression by about 10- to 20-fold, indicating
the presence of INS elements.

Interestingly, it was observed that the
insertion of element F, spanning 350 bp, in plasmid p19
abolished production of Gag, indicating the presence of a
strong INS within this element. The presence of the RRE
in cis and Rev in trans resulted in production of high
levels of RSV p19gag. Fragment F also had a smaller
downregulatory effect on the expression of the
INS-corrected p17gag of HIV-1 (p17M1234). These
experiments revealed the presence of multiple elements
located within the env mRNA that cause inhibition of p19gag
expression.

2.2 Elimination of the INS within
fragment F

Six synthetic oligonucleotides (Table 3) were
generated that introduced 103 point mutations within this
region of 330 nt without affecting the amino acid
composition of Env. The mutated fragment F was tested in
p19 to verify that the INS elements are destroyed. The
introduction of the mutations within oligo #1 only
marginally affected the expression of p19gag, whereas the
presence of all oligos (#1 to #6) completely inactivated
the INS effect of fragment F. This is another example
showing that more than one region within an INS element
needs to be mutagenized to eliminate the INS effect.

It is noteworthy that this INS element is
present in all the multiply spliced Rev-independent mRNAs,
such as tat, rev and nef. Experiments were performed to
define the function of fragment F within the class of the
small mRNAs by removing this fragment from the tat cDNA.
In the context of this mRNA, this element confers only a
weak INS effect (3- to 5-fold inhibition), which suggests
that inhibition of expression in env mRNA may require the
presence of at least two distinct elements. These results
suggested that the INS effect within env is based on
multiple interacting components. Alternatively, the
relative location and interactions among multiple INS
components may be important for the magnitude of the INS
effect. Therefore, more than one type of analysis in
different vectors may be necessary for the identification
and elimination of INS.

2.3 Identification of INS elements using
the p37M1-10D expression vector

The env coding region was subdivided into
different consecutive fragments. These fragments and
combinations thereof were PCR-amplified using oligos as
indicated in Fig. 11 and inserted downstream of the
mutated p37gag gene in p37M1-10D. The plasmids were
transfected into HLtat cells that were harvested the next
day and analyzed for p24gag expression. Fig. 11 shows that
the presence of fragments 2, 3, 5 as well as the
combination 1+2+3 lowered gag expression substantially.
Different oligos (Table 4) were synthesized that change
the AT-rich domains, including the three AATAAA elements
located within the env coding region, by changing the
nucleotide but not the amino acid composition of Env. In
a first approach, these oligos 1-19 are being introduced
into plasmid p120R270 with the goal of producing gp120 in
a Rev-independent manner. Oligonucleotides such as oligos
20-26 will then be introduced into the gp41 portion, the
two env portions combined, and the complete gp160 expressed
in a Rev-independent manner.
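
The requirement that the oligos change the nucleotide but not the amino acid composition can be checked mechanically. The sketch below translates the wild-type and mutant M11 sequences from Table 4 (whitespace removed) with the standard genetic code and compares the results. The reading frame (frame 0 of the oligo) is an assumption, chosen because it is the frame in which the two translations agree, and the helper names `translate`, `at_fraction` and `differences` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate a DNA string in the given frame, ignoring a trailing partial codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


def at_fraction(seq):
    """Fraction of A and T residues in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def differences(wild_type, mutant):
    """0-based positions at which the two sequences differ (case-insensitive)."""
    return [i for i, (a, b) in enumerate(zip(wild_type.upper(), mutant.upper())) if a != b]


# M11 (6542-6583) from Table 4; lower case marks mutated nucleotides.
WT_M11 = "CAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGGTAGTA"
MUT_M11 = "CAACTGCTGcTgAAcGGCAGcCTgGCcGAgGAgGAGGTAGTA"

if __name__ == "__main__":
    print("wild type:", translate(WT_M11))
    print("mutant:   ", translate(MUT_M11))
    print("changed positions:", differences(WT_M11, MUT_M11))
    print("AT fraction: %.2f -> %.2f" % (at_fraction(WT_M11), at_fraction(MUT_M11)))
```

For this oligo every substitution falls on a third codon position and replaces an A or T with a G or C, so the encoded peptide is unchanged while the AT content drops.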
EXAMPLE 3
PROTO-ONCOGENE C-FOS
Fragments of the fos gene were inserted into the
vector p19 and the expression of the resulting plasmids
was analyzed by transfection into HLtat cells. It was
found that several fragments inhibited protein expression.
A strong INS was identified in the fragment containing
nucleotides 3328-3450 ("fragment [3328-3450]")
(nucleotides of the fos gene are numbered according to
GenBank sequence entry HUMCFOT, ACCESSION # V01512). In
addition, a weaker element was identified in the coding
region.

To eliminate these INS, the following
oligonucleotides were synthesized and are used in
mutagenesis experiments as specified supra.

To eliminate the INS in the fos non-coding
region, the following oligonucleotides, which make changes
in the fos non-coding region between nucleotides [3328-3450]
(the letters in lower case indicate mutated
nucleotides), were synthesized and are used to mutagenize
fragment [3328-3450] in mutagenesis experiments as specified
supra:

#1:

3349-3391

TGAAAACGTTcgcaTGTGTcgcTAcgTTgcTTAcTAAGATGGA (SEQ ID NO: 16)

#2:

3392-3434

TTCTCAGATAccTAgcTTcaTATTgccTTaTTgTCTACCTTGA (SEQ ID NO: 17)

These oligonucleotides are used to mutagenize
fos fragment [3328-3450] inserted into vectors p19,
p17M1234 or p37M1234, and the expression of the resulting
plasmids is analyzed after transfection into HLtat cells.

The expression of fos is expected to be
increased by the elimination of this INS region.
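
Because the mutated nucleotides are written in lower case, the positions changed by each oligonucleotide can be read directly from the sequence. The sketch below does this for the two non-coding-region oligonucleotides given above (whitespace removed) and also checks that each sequence length matches its stated coordinate range. The function names and the assumption that the first base of each oligo sits at the stated start coordinate are illustrative.

```python
# (start, end, sequence) for the two fos non-coding-region oligonucleotides;
# lower case marks mutated nucleotides.
OLIGOS = {
    "#1": (3349, 3391, "TGAAAACGTTcgcaTGTGTcgcTAcgTTgcTTAcTAAGATGGA"),
    "#2": (3392, 3434, "TTCTCAGATAccTAgcTTcaTATTgccTTaTTgTCTACCTTGA"),
}


def mutated_positions(start, seq):
    """Gene coordinates of lower-case (mutated) bases, assuming seq[0] is at `start`."""
    return [start + i for i, base in enumerate(seq) if base.islower()]


def check_oligo(start, end, seq):
    """Verify the length against the coordinate range, then list mutated positions."""
    expected = end - start + 1
    if len(seq) != expected:
        raise ValueError(f"expected {expected} nt, got {len(seq)}")
    return mutated_positions(start, seq)


if __name__ == "__main__":
    for name, (start, end, seq) in OLIGOS.items():
        positions = check_oligo(start, end, seq)
        print(name, f"{len(positions)} mutations in {len(seq)} nt:", positions)
```

A length check of this kind is also a quick way to catch transcription errors in an oligonucleotide before it is ordered.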

To further define and eliminate the INS elements
in the coding region, additional longer fragments of fos
are introduced into vector p37M1234. The INS element in
the coding region is first mapped more precisely using
this expression vector and is then corrected using the
following oligonucleotides:

#1:

2721-2770

GCCCTGTGAGtaGGCActGAAGGacAGcCAtaCGtaACatACAAGTGCCA (SEQ ID
NO: 18)



#2:

2670-2720 

AGCAGCMCAATGAaCCTagtagcGAtagcCTgAGtagcCCtACGCTGCTG (SEQ 
ID NO: 19) 

#3:

2620-2669 

ACCCCGAGGCaGAtagCTTtCCatccTGcGCtGCcGCtCACCGCAAGGGC (SEQ ID 
NO: 20) 

#4:

2502-2562 

CTGCACAGTGGaagCCTcGGaATCGGcCCtAT 
CTC (SEQ ID NO: 21) 

The expression of fos is expected to be
increased by the elimination of this INS region.
EXAMPLE 4
HIV-1 POL GENE
Vector p37M1234 was used to eliminate an
inhibitory/instability sequence from the pol gene of HIV-1
which had been characterized by A.W. Cochrane et al.,
"Identification and characterization of intragenic
sequences which repress human immunodeficiency virus
structural gene expression", J. Virol. 65:5305-5313
(1991). These investigators suggested that a region in
pol (HIV nucleotides 3792-4052), termed CRS, was important
for inhibition. A larger fragment spanning this region,
which contained nucleotides 3700-4194, was inserted into
the vector p37M1234 and its effect on the expression of
p37gag from the resulting plasmid (plasmid p37M1234RCRS)
(see Fig. 12) was analyzed after transfection into HLtat
cells.

Severe inhibition of gag expression (10-fold;
see Fig. 13) was observed.

In an effort to eliminate this INS, the
following oligonucleotides were synthesized (the letters
in lower case indicate mutated nucleotides) and used in
mutagenesis experiments.

First, it was observed that one AUUUA potential
instability element was within the INS region. This was
eliminated by mutagenesis using oligonucleotide M10pol and
resulted in plasmid p37M1234RCRSP10. The expression of
gag from this plasmid was not improved, demonstrating that
elimination of the AUUUA element alone did not eliminate
the INS. See Fig. 12. Therefore, additional mutagenesis
was performed and it was shown that a combination of
mutations introduced in plasmid p37M1234RCRS was necessary
and sufficient to produce high levels of gag proteins,
which were similar to those of the plasmid lacking CRS. The
mutations necessary for the elimination of the INS are
shown in Fig. 13.
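
Candidate elements such as AUUUA (ATTTA in the DNA sequence) and AATAAA can be located by a simple scan before mutagenesis is planned. The sketch below is illustrative only: it uses the legible wild-type and mutant M7gag and M8gag sequences from Table 2 rather than the pol sequences, the function name `find_motif` is hypothetical, and a motif hit merely flags a candidate, since, as noted above, removing a single AUUUA element did not by itself eliminate the INS.

```python
def find_motif(seq, motif):
    """0-based start positions of every (possibly overlapping) motif occurrence.

    The sequence is upper-cased and U is read as T, so RNA or DNA notation and
    lower-case mutation marks are all accepted; the motif is given as DNA.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


# Wild-type / mutant pairs from Table 2 (whitespace removed).
WT_M7GAG = "CAGTAGGAGAAATTTATAAAAGATGGATAATCCTG"
MUT_M7GAG = "CAGTAGGAGAgATcTAcAAGAGgTGGATAATCCTG"
WT_M8GAG = "GGATTAAATAAAATAGTAAGAATGTATAGCCCTACC"
MUT_M8GAG = "GGATTgAAcAAgATcGTgAGgATGTATAGCCCTACC"

if __name__ == "__main__":
    for name, seq in [("M7gag wt", WT_M7GAG), ("M7gag mut", MUT_M7GAG),
                      ("M8gag wt", WT_M8GAG), ("M8gag mut", MUT_M8GAG)]:
        print(name, {m: find_motif(seq, m) for m in ("ATTTA", "AATAAA")})
```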

The above results demonstrate that HIV-1 pol
contains INS elements that can be detected and eliminated
with the techniques described.

These results also suggest that regions outside
of the minimal inhibitory region in CRS as defined by A.W.
Cochrane et al., supra, influence the levels of
expression. These results suggest that the RNA structure
of the region is important for the inhibition of
expression.
Table 1

Correspondence between Sequence
Identification Numbers and Nucleotides in Figure 4

Sequence ID No.    Figure 4

SEQ ID NO:1        nucleotides 336-731
SEQ ID NO:2        nucleotides 402-452, above line
SEQ ID NO:3        nucleotides 536-583, above line
SEQ ID NO:4        nucleotides 585-634, above line
SEQ ID NO:5        nucleotides 654-703, above line
SEQ ID NO:6        nucleotides 402-452, below line (M1)
SEQ ID NO:7        nucleotides 536-583, below line (M2)
SEQ ID NO:8        nucleotides 585-634, below line (M3)
SEQ ID NO:9        nucleotides 654-703, below line (M4)



Table 2

Synthetic oligonucleotides used
in the mutagenesis of HIV-1 gag and pol genes

The upper sequence is the wild-type HIV-1 as
found in HXB2, while the bottom is the mutant
oligonucleotide sequence. The location of the sequence is
indicated in parentheses.
M5gag (778-824)

NO ^ { ^^ ( ^^ A ^ T ^ ^TGGGTAAAAGTAGTAG AAGAGAAGG CT (SEQ ID 

xx X X X XXX 
CACCTAGAACccTgAAcGCcTGGGTgAAgGTgGTAGAAGAGAAGGCT (SEQ ID
M6gag (871-915)

CCACCCCACAAGATTTAAACACCATGCTAAACACAGTGGGGGGAC (SEQ ID NO

x XXX XXX X 

CCACCCCACAgGAccTgAACACgATGtTgAACACcGTGGGGGGAC (SEQ ID NO 

M7gag (1105-1139) 

CAGTAGGAGAAATTTATAAAAGATGGATAATCCTG (SEQ ID NO: 26) 
CAGTAGGAGAgATcTAcAAGAGgTGGATAATCCTG (SEQ ID NO: 27) 

M8gag (1140-1175)

GGATTAAATAAAATAGTAAGAATGTATAGCCCTACC (SEQ ID NO: 28)
GGATTgAAcAAgATcGTgAGgATGTATAGCCCTACC (SEQ ID NO: 29)

M9gag (1228-1268) 
ACCGGTTCTATAAAACTCT^ACSAGCCGAGCAAGCTTCACAG (SEQ ID NO: 30)

ACCGGTTCTAcAAgACcCTgcGgGCtGACSCAAGCTTCACAG (SEQ ID NO: 31) 
M10gag (1321-1364)

ATOTAAGACTA^AAAAGCArTGGGACCAGCGGCTACACTA (SEQ ID NO: 
X XX X X XX X X

ATTGTAAGACcATcCTgAAgGCCcTcGGcCCAGCGGCTACACTA (SEQ ID NO: 

M11gag (1416-1466)

AGAGTTTTGGCTGAAGCAATGAGCCAAGTAACAAATTCAGCTACCATAATG 



ID NO: 34) — — -^~-i-««v-«Afti-ruAGCTACCATAATG (SEQ 
^^NoT'^T^^^^^^^^^^^^^^^^^^^^^^^'^ (SEQ 






M12gag (1470-1520)

^t^6^^^^ CC ^ A ^ TTGrrAAGTC '^CAATTGT (SEQ 

X XX XX X XX 

OGAGAGG^cTTccGGAACCAgcGgAAGATcGTcAAGTOTTTCAAlTCT ( SEQ 
M13gag (1527-1574)

G^G^^CACAGCCAGAAATTGCAGGGCCCCTAGGAAAAAGGGCTGT (SEQ ID 

XXX XX X 

N^^^ CCGC ^ Gg ^ CTCCC ^ CCCCCCGG ^5AAGGGCTGT (SEQ ID 

M14gag (1581-1631) 

£^3^*2^^ CSEQ 
'^f^g^^ (SEQ 

™£2;L ( 1823 ' 1879 } (K to R difference introduced) 
(Se£ iD^ 

XXX X X XX X 

(SE^N^ 

M1pol (1936-1987)

GATA& ^^ (SEQ 

X X X X X XXX XX 

^ TA £f^f^ Tc ^^ (SEQ 

M2pol (2105-2152) 

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^CA (SEQ ID 

xxxxxx XX 

CCTA^^CgGTgCCcGTgAAgTTgAAGCCgGGgAI^AT^CCCA (SEQ ID 



NO: 47)
M3.2pol (2162-2216)
^^D^^^ AAGAAAAAATAA ^ 
M4pol (2465-2515) 

'gf^^ (SEQ 
^ C ^^ ACAC * G ^ (SEQ 

M5pol (2873-2921) 

Nof ^^^^^^T^TrTACCCAGGGAITAAAG (SEQ ID 

XX X XX X X 

^AGTGGGGAAggTOAAcTGGGCgAGcCAGATcTACCCgGGGATTAAAG (SEQ ID 

M6pol (3098-3150) 

^^^f^^^^^^^TCTCAAAACAGG (SEQ 
S^N^^5^^^^^^^^^^^^^^^^^3^cCTGAAAACAGG (SEQ 

M7pol (3242-3290) 

TG^AAAGACTCCTAAATTTAAACTGCCCATACAAAAGGAAACATGGG (SEQ ID 
^AAAGACgCCgAAgTT^gCTGCCCATcCAgAAG^gACAT^ (SEQ ID 

M8pol (3520-3569) 

GAAGACTGAGTTAC^GCAATTTATCTAGCTITC^GGArrCGGGATrAG (SEQ ID 

xxxxxxxxx x 

25 ^5^ AGCT3 ^ gGC9ATcTAcCT ^ C 3=TOO i G<»cTCGGGA T rAG (SEQ ID 
M8.2pol (3643-3698) 

30 <™^?^f AT ^ 
M9pol (3749-3800)

OT^GTCCT^AAT^i^GTACTATrriTAGATCGAATAGATAAGGCCC (SEQ 

XX XX XX X X X X 

GTC^GTGCTGGgATCcGGAAgGTgCTAlTccTgGAcGGgATcGATAAGGCCC (SEQ 

M9.2pol (3806-3863) 

XX XXXX XXX x x x x 

M10pol (3950-4001)

GG^TC^ fSEQ 

xxxxxxxxxxxx 

ID^^^^GT)"^^^^^* ^^^Ac c TgGAgGGgAAgGTg ATC CTGGTAG (SEQ 
M11pol (4031-4096)

M12pol (4097-4151) 
^Q^f£^^ TA ^^^ 

xxxxxx XX X X 

25 ^^B T ^7if^ TC ^ C ^ 
M13pol (4220-4271) 

^AGTAGTAGAATCTATGAATAAAGAATTAAAGAA^ {SEQ 

30 ^TAGTA^TC^TGAA^ (SEQ 



S&t^ss. tindicates the f °- d * 

3 5 T^Q^ T S^^ TA ^ C 3 GAC ^ cGG ^CAAcTTCACtGG TO CTACGG 
Table 3

Sequences of mutant oligos designed
to eliminate the INS effect of fragment F

The six oligonucleotides used to eliminate the
INS effect of fragment F (oligos #1 to #6) are set forth
above in Example 2 (SEQ ID NOS: 10-15).



Table 4

Mutant oligos designed to
destroy INS elements within the env coding region

The wild-type (top) and the mutant oligo (below)
of 26 different regions are shown.
Mutant oligos for env of HIV-1:
M1 (5834-5878) 46-mer

CTTGGGATGTTGATGATCTGTAGTGCTACAGAAT^A^ {SEQ ID N0 : 

CTTGGGATGcTGATCATcTC (SEQ ID NO: 

M2 (5886-5908) 24-mer

ATTATGGGGTACCTGTGTGGAAG (SEQ ID NO: 77) 
X X X 

ATTATGGcGTgCCcGTGTGGAAG (SEQ ID NO: 78)
M3 (5920-5956) 38-mer

^ cktcwtt^ (seq id N0: 79) 

CACTCTATTcTGcGCcTCcGAcGCcAAgGCATATGAT (SEQ ID NO: 80)
M4 (5957-5982) 27-mer 

ACAGAGGTACATAATCTTTGGGCCAC (SEQ ID NO: 81) 
ACAGAGGTgCAcAAcGTcTGGGCCAC (SEQ ID NO: 82)
M5 (6006-6057) 53-mer 

ID^^^^ (SEQ 

xxxxxx XX X X X y 
^Sf^^^ ( SEQ 
M6 (6135-6179) 46-mer

TAACCCCACTCTGTGTTAGTTTAAAGTGCACTGATTTGAAGAATC (SEQ ID NO: 

X X X XX X X XX 

TAACCCCcCTCTGcGTgAGGCTgAAGTGCACcGAccTGAAGAATG (SEQ ID NO: 

M7 (6251-6280) 31-mer 

ATCAGCACAAGCATAAGAGGTAAGGTGCAG (SEQ ID NO: 87)

X XX X X 

ATCAGCACcAGCATccGcGGcAAGGTGCAG (SEQ ID NO: 88) 

M8 (6284-6316) 34-mer 

GAATATGCATTTITOATA^ (SEQ ID NO: 89) 

GAATATGCcTTcTTcTAcAAgCTgGATATAATA (SEQ ID NO: 90)
M9 (6317-6343) (28-mer) 

CCAATAGATAATGATACTACCAGCTAT (SEQ ID NO: 91)
X X X x 

CCAATAGcTAAgGAcACcACCAGCTAT (SEQ ID NO: 92) 

M10 (6425-6469) (46-mer)

^CCCGGCTGCm^ (SEQ m N0; 

xxx xxxxxx 

94 ^^^^ C ^ C ^ c ^^^^ c ^9^9^cJAcAAcAAGACGTTC (SEQ ID NO: 

M11 (6542-6583) (42-mer)
CAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGGTAGTA (SEQ ID NO: 95)

xxx xxxxx 

CAACTGCTGcTgAAcGGCAGcCTgGCcGAgGAgGAGGTAGTA (SEQ ID NO: 96) 
M12 (6590-6624) (35-mer) 

TCTGTCAATTTCACGGACAATGCTAAAACCATAAT (SEQ ID NO: 97)
TCTGCCAAcTTCACcGACAAcGCcAAgACCATAAT (SEQ ID NO: 98)
M13 (6632-6663) (32-mer)

CTGAACACATCTGTAGAAATTAATTGTACAAG (SEQ ID NO: 99)

xxxxxx 

CTGAACCAgTCcGTgGAgATcAAcTGTACAAG (SEQ ID NO: 100)






M14 (6667-6697) (31-mer) 

CAACAACAATACAAGAAAAAGAATCCGTATC (SEQ ID NO: 101) 

XXX XX X 
CAACAACAAcACcGGcAAgcGcATCCGTATC (SEQ ID NO: 102)
M15 (6806-6852) (47-mer)

GCTAG^TTAAGAGAACAAmGGAAATAATAAAACAATAATCTT (SEQ ID 

XXXXXXXXXXXXX 
N^^ 9CT9CGCGA9CA9TAcGG 9 AAcAAcA A3ACcATAATCrr (SEQ ID 

M16 (nt 6917-6961) (45-mer)

TOCTACTGTAATTCAACACAACTGTTTAATAGTACTTGGTrTAAT (SEQ ID NO: 

TOCTACTGgAAcTC^CcCAgCTGTTcAAcAGcACcTGGTTTAAT (SEQ ID NO: 

M17 (nt 7006-7048) (43-mer)

CA^TCACCCTCCCATCCAGAATAAAACAAATTATAAACATG (SEQ ID NO: 

^CAATCACcCTgCCcTGCcG^A T LAgCAgATcATAAACATG (SEQ ID NO: 

M18 (nt 7084-7129) (46-mer) 
15 ^CAGTtK^CAAATTAGATGTTCATCAAATATTACAGGGCTCCTA (SEQ ID NO: 

xxxxxxxxxxx 

CATCAGCGGcCAgATccGcTGcTCcTCcAAcATcACcGGGCTGCTA (SEQ ID NO: 

M19 (nt 7195-7252) (58-mer) 
20 tslo^^ 3 * 3 **^^ 

X x XXXXXXXXYyyv 

(sl^o^!^ 

M20 (nt 7594-7633) (40-mer) 
GCCTTGGAATCCTAGTTGGAGTAATAAATCTCTGGAACAG (SEQ ID NO: 113)

GCCTTGGAACGCCAGCTGGAGCAA^^TCCCTCGAACAG (SEQ ID NO: 114) 
M21 (nt 7658-7689) (32-mer) 

GAGTGGGACAGAGAAATTAACAATTACACAAG (SEQ ID NO: 115) 
GAGTGGGACcGcGAgATcAACAAcTACACAAG (SEQ ID NO: 116) 
M22 (nt 7694-7741) (48-mer)

ATACACTCCITAATTCAAGAATCGCAAAACCAGCAAGAAAAG^ (SEQ ID 

NO A ^^^ C ^^^^^^^^^^^^^^^^^AAGAATGAA (SEQ ID 

M23 (nt 7954-7993) (40-mer)

CAGGCCCGAAGGAATAGAAGAAGAAGGTGGAGAGAGAGAC (SEQ ID NO: 119)

XXXXXXXX

CAGGCCCGAgGGcATcGAgGAgGAgGGcGGcGAGAGAGAC (SEQ ID NO: 120)

M24 (nt 8072-8121) (50-mer) 

TACCACCGCTTGAGAGACTTACTCTTCATTCTAACC^ (SE Q ID 

xxxxxx XX X 
M25 (nt 8136-8179) (44-mer) 

^^^AAGC C ^ ^ ( -^AATATTGGTGGAATCT C CTACAGTATTGG (SEQ ID NO: 

x XX XX 

^T^gGCCCTCAAgTAcTGGTGGAAcCTCCTcC^GTAT^ (SEQ ID NO: 

M26 (nt 8180-8219) (40-mer) 

AGTCAGGAACTAAAGAATAGTGCTGTTAGCTTGCTCAATG (SEQ ID NO: 125)

xx xxxxxx 

AGTCAGGAgCTgAAGAAcAGcGCcGTgAaCcTGCTCAATG (SEQ ID NO: 126)
Comments:

Although the vast majority of oligonucleotides
follow the HXB2 sequence, some exceptions are noted:

In oligo M15, nt 6807 follows the pNL43
sequence. (Specifically, nt 6807 is C in NL43 but A in
HXB2.) Oligo M26 has the nucleotide sequence derived from
pNL43.
EXAMPLE 5

USE OF p37M1-10D OR p55M1-13P0 IN
IMMUNOPROPHYLAXIS OR IMMUNOTHERAPY

In postnatal gene therapy, new genetic
information has been introduced into tissues by indirect
means such as removing target cells from the body,
infecting them with viral vectors carrying the new genetic
information, and then reimplanting them into the body; or
by direct means such as encapsulating formulations of DNA
in liposomes; entrapping DNA in proteoliposomes containing
viral envelope receptor proteins; calcium phosphate
co-precipitating DNA; and coupling DNA to a polylysine-
glycoprotein carrier complex. In addition, in vivo
infectivity of cloned viral DNA sequences after direct
intrahepatic injection with or without formation of
calcium phosphate coprecipitates has also been described.
mRNA sequences containing elements that enhance stability
have also been shown to be efficiently translated in
Xenopus laevis embryos, with the use of cationic lipid
vesicles. See, e.g., J.A. Wolff et al., Science
247:1465-1468 (1990) and references cited therein.

Recently, it has also been shown that injection
of pure RNA or DNA directly into skeletal muscle results
in significant expression of genes within the muscle
cells. J.A. Wolff et al., Science 247:1465-1468 (1990).
Forcing RNA or DNA into muscle cells by other
means, such as by particle acceleration (N.-S. Yang et
al., Proc. Natl. Acad. Sci. USA 87:9568-9572 (1990); S.R.
Williams et al., Proc. Natl. Acad. Sci. USA 88:2726-2730
(1991)) or by viral transduction, should also allow the DNA
or RNA to be stably maintained and expressed. In the
experiments reported in Wolff et al., RNA or DNA vectors
were used to express reporter genes in mouse skeletal
muscle cells, specifically cells of the quadriceps
muscles. Protein expression was readily detected and no
special delivery system was required for these effects.
Polynucleotide expression was also obtained when the
composition and volume of the injection fluid and the
method of injection were modified from the described
protocol. For example, reporter enzyme activity was
reported to have been observed with 10 to 100 µl of
hypotonic, isotonic, and hypertonic sucrose solutions,
Opti-MEM, or sucrose solutions containing 2 mM CaCl2, and
also to have been observed when the 10- to 100-µl
injections were performed over 20 min with a pump instead
of within 1 min.

Enzymatic activity from the protein encoded by
the reporter gene was also detected in abdominal muscle
injected with the RNA or DNA vectors, indicating that
other muscles can take up and express polynucleotides.
Low amounts of reporter enzyme were also detected in other
tissues (liver, spleen, skin, lung, brain and blood)
injected with the RNA and DNA vectors. Intramuscularly
injected plasmid DNA has also been demonstrated to be
stably expressed in non-human primate muscle. S. Jiao et
al., Hum. Gene Therapy 3:21-33 (1992).

It has been proposed that the direct transfer of
genes into human muscle in situ may have several potential
clinical applications. Muscle is potentially a suitable
tissue for the heterologous expression of a transgene that
would modify disease states in which muscle is not
primarily involved, in addition to those in which it is.
For example, muscle tissue could be used for the
heterologous expression of proteins that can immunize, be
secreted in the blood, or clear a circulating toxic
metabolite. The use of RNA and a tissue that can be
repetitively accessed might be useful for a reversible
type of gene transfer, administered much like conventional
pharmaceutical treatments. See J.A. Wolff et al.,
Science 247:1465-1468 (1990) and S. Jiao et al., Hum. Gene
Therapy 3:21-33 (1992).

It had been proposed by J.A. Wolff et al.,
supra, that the intracellular expression of genes encoding
antigens might provide alternative approaches to vaccine
development. This hypothesis has been supported by a
recent report that plasmid DNA encoding influenza A
nucleoprotein injected into the quadriceps of BALB/c mice
resulted in the generation of influenza A nucleoprotein-
specific cytotoxic T lymphocytes (CTLs) and protection
from a subsequent challenge with a heterologous strain of
influenza A virus, as measured by decreased viral lung
titers, inhibition of mass loss, and increased survival.
J.B. Ulmer et al., Science 259:1745-1749 (1993).
Therefore, it appears that the direct injection
of RNA or DNA vectors encoding the viral antigen can be
used for endogenous expression of the antigen to generate
the viral antigen for presentation to the immune system
without the need for self-replicating agents or adjuvants,
resulting in the generation of antigen-specific CTLs and
protection from a subsequent challenge with a homologous
or heterologous strain of virus.

CTLs in both mice and humans are capable of
recognizing epitopes derived from conserved internal viral
proteins and are thought to be important in the immune
response against viruses. By recognition of epitopes from
conserved viral proteins, CTLs may provide cross-strain
protection. CTLs specific for conserved viral antigens
can respond to different strains of virus, in contrast to
antibodies, which are generally strain-specific.

Thus, direct injection of RNA or DNA encoding
the viral antigen has the advantage of being without some
of the limitations of direct peptide delivery or viral
vectors. (See J.B. Ulmer et al., supra, and the
discussions and references therein.) Furthermore, the
generation of high-titer antibodies to expressed proteins
after injection of DNA indicates that this may be a facile
and effective means of making antibody-based vaccines
targeted towards conserved or non-conserved antigens,
either separately or in combination with CTL vaccines
targeted towards conserved antigens. These may also be
used with traditional peptide vaccines, for the generation
of combination vaccines. Furthermore, because protein
expression is maintained after DNA injection, the
persistence of B and T cell memory may be enhanced,
thereby engendering long-lived humoral and cell-mediated
immunity.
Vectors for the immunoprophylaxis or
immunotherapy against HIV-1



The mutated gag genomic sequences in vectors
p37M1-10D or p55M1-13P0 (Fig. 6) will be inserted in
expression vectors using a strong constitutive promoter
such as CMV or RSV, or an inducible promoter such as
HIV-1.

The vector will be introduced into animals or
humans in a pharmaceutically acceptable carrier using one
of several techniques such as injection of DNA directly
into human tissues; electroporation or transfection of the
DNA into primary human cells in culture (ex vivo),
selection of cells for desired properties and
reintroduction of such cells into the body (said
selection can be for the successful homologous
recombination of the incoming DNA to an appropriate
preselected genomic region); generation of infectious
particles containing the gag gene, infection of cells ex
vivo and reintroduction of such cells into the body; or
direct infection by said particles in vivo.

Substantial levels of protein will be produced 
leading to an efficient stimulation of the immune system. 

In another embodiment of the invention, the
described constructs will be modified to express mutated
gag proteins that are unable to participate in virus
particle formation. It is expected that such gag proteins
will stimulate the immune system to the same extent as the
wild-type gag protein, but be unable to contribute to
increased HIV-1 production. This modification should
result in safer vectors for immunotherapy and
immunoprophylaxis.
D m T ^r^; 1 B HPaS!OT USING TRANSDOMINANT
TD-GAG-TD REV OR TD GAG-PRO-TD REV GENES

Direct injection of DNA or use of vectors other
than retroviral vectors will allow the constitutive high
level of trans-dominant gag (TDgag) in cells. In
addition, the approach taken by B.K. Felber et al.,
Science 239:184-187 (1988) will allow the generation of
retroviral vectors, e.g. mouse-derived retroviral vectors,
encoding HIV-1 TDgag, which will not interfere with the
infection of human cells by the retroviral vectors. In
the approach of Felber et al., it was shown that
fragments of the HIV-1 LTR containing the promoter and
part of the polyA signal can be incorporated without
detrimental effects within mouse retroviral vectors and
remain transcriptionally silent. The presence of Tat
protein stimulated transcription from the HIV-1 LTR and
resulted in the high level expression of genes linked to
the HIV-1 LTR.

The generation of hybrid TDgag-TDRev or TDgag-
pro-TDRev genes and the introduction of expression vectors
in human cells will allow the efficient production of two
proteins that will inhibit HIV-1 expression. The
incorporation of two TD proteins in the same vector is
expected to amplify the effects of each one on viral
replication. The use of the HIV-1 promoter in a manner
similar to the one described in B.K. Felber et al., supra,
will allow high level gag and rev expression in infected
cells. In the absence of infection, expression will be
substantially lower. Alternatively, the use of other
strong promoters will allow the constitutive expression of
such proteins. This approach could be highly beneficial,
because of the production of a highly immunogenic gag,
which is not able to participate in the production of
infectious virus, but which, in fact, antagonizes such
production. This can be used as an efficient
immunoprophylactic or immunotherapeutic approach against
AIDS.



Examples of trans-dominant mutants are described
in Trono et al., Cell 59:112-120 (1989).

1. Generation of constructs encoding
transdominant gag mutant proteins

Gag mutant proteins that can act as trans-
dominant mutants, as described, for example, in Trono et
al., supra, will be generated by modifying vector
p37M1-10D or p55M1-13P0 to produce transdominant gag
proteins at high constitutive levels.

The transdominant gag protein will stimulate the 
immune system and will inhibit the production of 
infectious virus, but will not contribute to the 
production of infectious virus. 

The added safety of this approach makes it more 
acceptable for human application. 

Those skilled in the art will recognize that any
gene encoding a mRNA containing an inhibitory/instability
sequence or sequences can be modified in accordance with
the exemplified methods of this invention or their
functional equivalents.

Modifications of the above described modes for 
carrying out the invention that are obvious to those of 
skill in the fields of genetic engineering, protein 
chemistry, medicine, and related fields are intended to be 
within the scope of the following claims. 

Every reference cited hereinbefore is hereby 
incorporated by reference in its entirety. 



WHAT IS CLAIMED IS:

1. A method for reducing the effect of 
inhibitory/instability sequences within the coding region 
of a mRNA, said method comprising the steps of: 

(a) providing a gene which encodes said 
mRNA; 

(b) identifying the inhibitory/instability 
sequences within said gene which 
encode said inhibitory/instability 
sequences within the coding region of 
said mRNA; 

(c) mutating said inhibitory/instability
sequences within said gene by making 
multiple point mutations; 

(d) transfecting said mutated gene into a 
cell; 

(e) culturing said cell in a manner to 
cause expression of said mutated gene; 

(f) detecting the level of expression of 
said gene to determine whether the 
effect of said inhibitory/instability 
sequences within the coding region of 
the mRNA has been reduced. 



2. The method of Claim 1 further comprising
the step of fusing said mutated gene to a reporter gene 
prior to said transfecting step and said detecting step is 
performed by detecting the level of expression of said 
reporter gene. 



3. The method of Claim 1 wherein step (b) 
further comprises the steps of: 

(a) fusing said gene or fragments of said 
gene to a reporter gene to create a 
fused gene; 

(b) transfecting said fused gene into a 
cell; 

(c) culturing said cell in a manner to 
cause expression of said fused gene; 

(d) detecting the level of expression of 
said fused gene to determine whether 
the expression of said fused gene is 
reduced relative to the expression of 
said reporter gene. 
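
The comparison recited in the detecting step above (expression of the fused gene relative to expression of the reporter gene alone) can be illustrated with a minimal Python sketch. The function names, the fragment labels, the signal values and the 0.5 cutoff are all hypothetical and chosen only for illustration; the claims do not specify a numerical threshold or a particular readout.

```python
def relative_expression(fused_signal, reporter_signal):
    """Ratio of fused-gene expression to reporter-alone expression."""
    if reporter_signal <= 0:
        raise ValueError("reporter-alone signal must be positive")
    return fused_signal / reporter_signal


def flag_ins_candidates(measurements, reporter_signal, threshold=0.5):
    """Return the fragments whose fusion is expressed below the cutoff.

    measurements: mapping of fragment name -> signal of the fused gene.
    threshold: hypothetical cutoff (assumption, not taken from the claims).
    """
    return [
        name
        for name, signal in measurements.items()
        if relative_expression(signal, reporter_signal) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical readouts in arbitrary units; not experimental data.
    readouts = {"fragment_A": 12.0, "fragment_B": 95.0, "fragment_C": 40.0}
    print(flag_ins_candidates(readouts, reporter_signal=100.0))
```

A fragment flagged this way would only be a candidate; the same measurement repeated after mutation is what shows whether the inhibitory effect has been reduced.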

4. The method of Claim 3 wherein step (a) 
comprises fusing said gene or fragments of said gene 
[illegible] the stop codon of said reporter gene. 

5. The method of Claim 3 wherein step (a) 
comprises fusing said gene or fragments of said gene 
[illegible] the 3' end of the coding region of said 
reporter gene. 

6. The method of Claim 1 or 2 wherein said 
mutating step changes the codons such that the amino acid 
sequence encoded by the mRNA is unchanged. 

7. [illegible] instability sequences [illegible] wherein 
[illegible] either G or C [illegible] A or T [illegible] 
the final nucleotide [illegible] of said mutated 
inhibitory/[illegible] about 50% A and T and about 
50% G and C. 

8. The method of Claim 6 wherein at least 75% 
[illegible] more preferred codons. 
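
The codon-level constraint in the claims above (point mutations that leave the encoded amino acid sequence unchanged while replacing a final A or T with G or C) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used to build the exemplified constructs: the helper names are hypothetical, positions are 0-based with the reading frame starting at position 0, the example sequence is illustrative, and no codon-preference table is applied.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def silence_at_rich_codons(dna, start, end):
    """Swap a final A/T for G or C in codons lying inside [start, end).

    A codon is changed only if the substitution is synonymous, so the
    translation of the returned sequence equals that of the input.
    """
    usable = len(dna) - len(dna) % 3
    out = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        if start <= i and i + 3 <= end and codon[2] in "AT":
            for base in "GC":
                candidate = codon[:2] + base
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    codon = candidate
                    break
        out.append(codon)
    return "".join(out) + dna[usable:]


if __name__ == "__main__":
    native = "ATGGGTGCGAGAGCGTCAGTATTAAGC"  # illustrative sequence only
    mutated = silence_at_rich_codons(native, 0, len(native))
    print(mutated)
    print(translate(native) == translate(mutated))  # prints True
```

Restricting `start` and `end` to a short window yields the clustered substitutions described in the claims, while codons outside the window are left untouched.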
10. The method of Claim 1 or 2 wherein said 
mRNA encodes the GAG protein of a Rev-dependent complex 
retrovirus. 

11. The method of Claim 10 wherein the 
Rev-dependent complex retrovirus is human immunodeficiency 
virus-1. 



12. A method of increasing the production of a 
polypeptide, wherein said polypeptide is encoded by a mRNA 
that contains one or more inhibitory/instability 
sequences, said method comprising the steps of: 

(a) providing a gene which encodes said 
mRNA; 

(b) identifying the inhibitory/instability 
sequences within said gene which 
encode said inhibitory/instability 
sequences within the coding region of 
said mRNA; 

(c) mutating said inhibitory/instability 
sequences within said gene by making 
multiple point mutations; 

(d) transfecting said mutated gene into a 
cell; 

(e) culturing said cell in a manner to 
cause expression of said mutated gene; 

(f) detecting the level of expression of 
said gene to determine that the effect 
of said inhibitory/instability 
sequences within the coding region of 
the mRNA has been reduced; 

(g) providing a host cell transfected with 
an expression vector containing said 
mutated gene; 

(h) culturing said host cell to cause 
expression of said polypeptide; and 

(i) recovering said polypeptide. 

13. A method of producing polypeptides, whose 
native production is impeded by the presence of an 
inhibitory/instability sequence, comprising the steps of: 

(a) providing a host cell transfected with 
an expression vector containing a gene 
encoding said polypeptide, said gene 
having been mutated to decrease the 
effect of the inhibitory/instability 
sequence; 

(b) culturing said host cell to cause 
expression of said polypeptide; and 

(c) recovering said polypeptide. 

14. The method of Claim 13 wherein said host 
cell is prokaryotic. 

15. The method of Claim 13 wherein said host 
cell is eukaryotic. 

16. The method of Claims 13, 14 or 15 wherein 
said gene is a cDNA. 

17. The method of Claims 13, 14 or 15 wherein 
said gene is genomic. 

18. An artificial nucleic acid construct 
comprising a gene wherein the expression of the native 
gene is impeded by the presence of inhibitory/instability 
sequences in the mRNA encoded by said native gene, said 
gene having been mutated to decrease the effect of the 
inhibitory/instability sequence. 

19. The construct of Claim 18 wherein the amino 
acid sequence encoded by said mutated gene is the same as 
the amino acid sequence encoded by the native gene. 

20. The construct of Claim 19 wherein said 
native gene is HIV-1 gag. 

21. The construct of Claim 20 wherein said HIV-1 
gag gene has been mutated by the introduction of 
multiple point mutations between nucleotides 402 and 452, 
536 and 583, 585 and 634, and 654 and 703. 
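
The nucleotide windows recited above lend themselves to a simple consistency check: given a native and a mutated sequence of equal length, every changed position should fall inside one of the listed windows. The Python sketch below assumes 1-based, inclusive coordinates in the same numbering system as the claim; that convention, the helper names and the example positions are assumptions made for illustration.

```python
# Windows from the claim above, taken as 1-based inclusive (assumption).
CLAIMED_WINDOWS = [(402, 452), (536, 583), (585, 634), (654, 703)]


def changed_positions(native, mutated, first_position=1):
    """List the positions at which two equal-length sequences differ."""
    if len(native) != len(mutated):
        raise ValueError("sequences must have the same length")
    return [
        first_position + i
        for i, (a, b) in enumerate(zip(native, mutated))
        if a != b
    ]


def outside_windows(positions, windows=CLAIMED_WINDOWS):
    """Return the positions that lie in none of the given windows."""
    return [
        p for p in positions
        if not any(low <= p <= high for low, high in windows)
    ]


if __name__ == "__main__":
    # Hypothetical changed positions; 584 sits between two windows.
    print(outside_windows([410, 450, 540, 584, 700]))  # prints [584]
```

An empty result from `outside_windows` indicates that all substitutions are confined to the recited regions.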

22. The construct of claim 19 wherein said 
native gene is HIV-1 env. 

23. An assay kit for identifying 
inhibitory/instability sequences in a mRNA, comprising: 

(a) the nucleic acid construct of Claim 20 
or 21; and 

(b) a detection system for detecting the 
level of expression of said gene in 
said nucleic acid construct. 

24. The kit of Claim 23 wherein said detection 
system is an ELISA. 

25. An artificial nucleic acid construct 
comprising a gene mutated by the method of Claim 1 or 2. 

26. A vector comprising the nucleic acid 
construct of Claim 25. 






27. A transformed host cell comprising the 
artificial nucleic acid construct of Claim 25. 

28. A vector comprising the nucleic acid 
construct of Claim 18 or 19. 

29. A transformed host cell comprising the 
artificial nucleic acid construct of Claim 18 or 19. 

30. A transformed host cell of Claim 29 wherein 
said cell is selected from the group consisting of 
eukaryotes and prokaryotes. 

31. The host cell of Claim 30 wherein said cell 
is a human cell. 



32. The host cell of Claim 30 wherein said cell 
is a Chinese Hamster Ovary cell. 

33. The host cell of Claim 30 wherein said cell 
is E. coli. 



34. The construct of Claim 20 wherein said HIV-1 
gag gene has been mutated by the introduction of 
multiple point mutations between nucleotides 402 and 452, 
536 and 583, 585 and 634, 654 and 703, 871 and 915, 1105 
and 1139, 1140 and 1175 and 1321 and 1364. 

35. The construct of Claim 34 wherein said HIV-1 
gag gene is p37M1-10D. 

36. The construct of Claim 20 wherein said HIV-1 
gag gene has been mutated by the introduction of 
multiple point mutations between nucleotides 402 and 452, 
536 and 583, 585 and 634, 654 and 703, 871 and 915, 1105 
and 1139, 1140 and 1175, 1321 and 1364, 1416 and 1466, 
1470 and 1520, 1527 and 1574, and 1823 and 1879. 

37. The construct of Claim 36 wherein said HIV-1 
gag gene is p55M1-13P0. 






38. A vaccine composition for inducing immunity 
in a mammal against HIV infection comprising a 
pharmaceutically acceptable medium and further comprising 
a therapeutically effective amount of a nucleic acid 
construct capable of producing HIV gag protein in the 
absence of any HIV regulatory protein in a cell in vivo. 

39. A vaccine composition according to claim 38 
wherein said mammal is a human. 

40. A vaccine composition according to claim 38 
wherein said regulatory protein is HIV-1 Rev. 



41. A vaccine composition according to claim 38 
wherein said construct is selected from the group 
consisting of the construct of claim 20, 21, 34, 35, 36, 
and 37. 



42. A method for inducing immunity against HIV 
infection in a mammal which comprises administering to a 
mammal a therapeutically effective amount of a vaccine 
composition comprising a nucleic acid construct capable of 
producing HIV gag protein in the absence of any HIV 
regulatory protein in a cell in vivo. 

43. A method according to claim 42 wherein said 
mammal is a human. 



44. A method according to claim 42 wherein said 
regulatory protein is HIV-1 Rev. 

45. A method according to claim 42 wherein said 
construct is selected from the group consisting of the 
construct of claim 20, 21, 34, 35, 36, and 37. 

[Figure: annotated nucleotide sequence, numbered from position 1 to beyond position 7161, with a partial amino acid translation and marked restriction sites including BssHII (711), XbaI (1893) and ApaI. The sequence itself is not recoverable from the scanned text.] 